The chain frames required to satisfy all 2K declared I/Os of 128KB each take more than a megabyte of physical memory, all of which the existing code tries to allocate as a single physically contiguous region. This patch removes the physical contiguity requirement, leaving only virtual contiguity. I considered other allocation schemes, but the smaller the allocation units become, the bigger the overhead and/or complexity, reaching about 100% overhead if each frame is allocated separately.
The patch also raises the hard limit on chain frames from 2K to 16K. That is more than enough for the default REQ_FRAMES and MAXPHYS (the drivers will automatically allocate fewer frames than that), while with an increased MAXPHYS the limit caps the maximum memory usage.
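
For a rough sense of where the "more than a megabyte" figure can come from, here is a back-of-the-envelope sketch. It is not taken from the driver: the 4KB page size, 128-byte chain frame, 16-byte scatter/gather element, and the helper names are all assumptions chosen for illustration, and it ignores any elements that fit in the main request frame.

```c
#include <stddef.h>

/*
 * Illustrative assumptions only; none of these values are taken from the
 * driver source or from the patch itself.
 */
#define ASSUMED_PAGE_BYTES	4096u	/* assumed page size */
#define ASSUMED_FRAME_BYTES	128u	/* assumed size of one chain frame */
#define ASSUMED_SGE_BYTES	16u	/* assumed size of one S/G element */

/* Hypothetical helper: chain frames needed to describe one I/O. */
static size_t
chain_frames_per_io(size_t io_bytes)
{
	/* One segment per page, plus one for a buffer that is not page-aligned. */
	size_t segs = io_bytes / ASSUMED_PAGE_BYTES + 1;
	/* The last element of each frame is assumed to link to the next frame. */
	size_t sges_per_frame = ASSUMED_FRAME_BYTES / ASSUMED_SGE_BYTES - 1;

	/* Round up to whole frames. */
	return ((segs + sges_per_frame - 1) / sges_per_frame);
}

/* Hypothetical helper: total chain frame memory for n_ios concurrent I/Os. */
static size_t
chain_bytes_total(size_t n_ios, size_t io_bytes)
{
	return (n_ios * chain_frames_per_io(io_bytes) * ASSUMED_FRAME_BYTES);
}
```

Under these assumptions, chain_bytes_total(2048, 128 * 1024) comes to 1310720 bytes (1.25MB), which is in line with the estimate above. With the same assumed 128-byte frame, the new 16K hard limit would correspond to at most 2MB of chain frames.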