Having to use 0CFA - in other words, no duplication at all of either class types or functions - during the first iteration becomes terribly slow for larger programs. Worse, being so imprecise, it makes life hard for IFA, forcing it to split huge contours (consider a single contour for list in a larger program!), which makes it slow and requires many iterations.
There are multiple interesting ways of combating this problem. We could selectively apply CPA during the first iterations, by limiting the sizes of the cartesian products. Of course, fully applying it would make the first iteration even slower, while applying it only selectively would probably not increase precision by much. I think what is really needed is to provide IFA with an initial 'guesstimate' of the set of object contours that it should deduce. This would not work for CPA alone, because CPA is not data-adaptive: it would pollute itself whenever something 'unexpected' happened.
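To make the idea of limiting cartesian products concrete, here is a minimal sketch. The function name, the `max_size` cutoff, and the fallback signal are all my own illustration, not part of any actual analyzer: each argument position has a set of possible types, and if the full product of those sets would exceed the cutoff, we give up on CPA for that call and fall back to a single imprecise (0CFA-style) template.

```python
from itertools import product

def limited_cartesian_product(arg_types, max_size=10):
    # arg_types: one set of possible types per argument position.
    # If the full cartesian product would exceed max_size call
    # templates, signal the caller to analyze this call site
    # monovariantly (0CFA-style) instead of applying CPA.
    size = 1
    for types in arg_types:
        size *= len(types)
    if size > max_size:
        return None  # too many combinations: fall back
    return list(product(*arg_types))

# Two argument positions, 1 * 2 = 2 templates: CPA is applied.
small = limited_cartesian_product([{int}, {float, str}])

# Five positions with 2 types each, 2**5 = 32 > 10: fall back.
big = limited_cartesian_product([{int, float}] * 5, max_size=10)
```

The cutoff is exactly the 'selective' application mentioned above: precise templates for small products, one shared template where the product would explode.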
How to get a good initial set of object contours? Of course, heuristics can be applied - a lot of them, especially ugly ones. For example, a [1, 2] constructor could be grouped under a 'list(int)' contour. This becomes messy for more general, polymorphic constructors - what to do with an [x, y] constructor?
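A toy version of such a heuristic, to show both the easy case and where it breaks down (the function and the 'list(?)' placeholder are hypothetical, just for illustration): if every element of a literal constructor has the same concrete type, group the allocation under one 'list(T)' contour; otherwise we cannot guess and must leave the splitting to IFA.

```python
def guess_contour(elements):
    # Hypothetical heuristic: collect the concrete element types of a
    # literal list constructor. A single type gives a clean monomorphic
    # contour; anything else (including an empty or symbolic [x, y]
    # constructor) is beyond this heuristic.
    types = {type(e).__name__ for e in elements}
    if len(types) == 1:
        return 'list(%s)' % types.pop()
    return 'list(?)'  # polymorphic or unknown: IFA must split it

easy = guess_contour([1, 2])       # 'list(int)'
messy = guess_contour([1, 'a'])    # 'list(?)'
```

The [x, y] case from the text is precisely the 'list(?)' branch: without knowing what flows into x and y, a syntactic heuristic has nothing to go on.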
I think a better, more elegant solution would be to profile the program that is to be compiled. Of course, if the program has unit tests, those would be a great match - but that's beside the point. It should be very simple to dump the actual types for each allocation site, possibly even relative to the actual argument types of the current function invocation. This gives us a great starting point. The dumps do not have to be perfect, because IFA will adjust to any imperfections. Best of all, we could then enable CPA during all iterations, greatly improving IFA and reducing the number of iterations.
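A minimal sketch of what such a profiling dump could look like - everything here (the shim, the site-id scheme, the rewritten constructor) is assumed for illustration: the idea is that each allocation site records the concrete element types it actually sees at run time, and the resulting per-site type sets become the initial contour guesses for IFA.

```python
from collections import defaultdict

# Observed element types per allocation site, keyed by a site id.
site_types = defaultdict(set)

def profiled_list(site_id, elements):
    # Record the concrete element types seen at this allocation site,
    # then behave exactly like the original list constructor.
    site_types[site_id].update(type(e).__name__ for e in elements)
    return list(elements)

def make_pair(x, y):
    # A profiling build would rewrite the literal '[x, y]' into a
    # call like this, tagging it with its source location.
    return profiled_list('make_pair:1', [x, y])

make_pair(1, 2)
make_pair(3, 4)
# After the run, site 'make_pair:1' only ever held ints, so the
# dump suggests seeding IFA with a single list(int) contour there.
```

Note that even if some run also passed strings through make_pair, the dump would merely suggest an extra contour; as the text says, IFA adjusts to imperfect guesses either way.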
No, I don't mind that people will have to profile their applications. Especially if a simple technique like this turns out to greatly improve scalability.
BTW, profiling has another advantage: it can provide type distributions at dynamic call sites. In the case of polymorphic inline caches, we can move the most frequently used types up the ladder.
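The ladder reordering amounts to a frequency sort; a small sketch (the function is hypothetical) of turning a profiled stream of receiver types at one call site into the order in which an inline cache should test them:

```python
from collections import Counter

def pic_order(observed_types):
    # Given the receiver types observed at a dynamic call site,
    # order the inline-cache checks from most to least frequent,
    # so the common case is dispatched after the fewest tests.
    counts = Counter(observed_types)
    return [t for t, _ in counts.most_common()]

order = pic_order(['B', 'A', 'A', 'A', 'C', 'B'])  # ['A', 'B', 'C']
```

Here 'A' was seen most often, so its type check goes first in the cache.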