Driftwood
A larger photosite does increase full well capacity, but the readout noise also increases: binning 4 photosites into one quadruples readout noise. Simple as that.
I think this is the end of the discussion. Pixel binning has its compromises and won't be seen in high-end cinema cameras.
I'm aware of 2x2 pixel "summing". For cinema cameras it's a technical compromise. Take the Sony F5/F55, for example: when shooting over 60fps in 2K, the sensor switches to pixel binning mode for faster readout, and the quality drops significantly. That's why the A7S/A7R II were marketed as having no pixel binning and full sensor readout. All scaling is done in subsequent signal processing, and the SNR/latitude increase is the same (bin from 4K to 2K vs downsample from 4K to 2K).
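To illustrate the "bin vs downsample" point, here is a rough sketch, not a model of any real sensor. It assumes a flat grey frame with independent Gaussian noise on every pixel, and compares summing each 2x2 block against averaging it (a simple box-filter downsample). The names `snr` and `simulate`, and the signal and noise values, are made up for the example.

```python
import random
import statistics

def snr(values):
    """Signal-to-noise ratio as mean over standard deviation."""
    return statistics.fmean(values) / statistics.pstdev(values)

def simulate(width=256, height=256, signal=100.0, noise_sigma=10.0, seed=0):
    rng = random.Random(seed)
    # Assumption: flat grey frame, independent Gaussian noise per pixel.
    frame = [[signal + rng.gauss(0.0, noise_sigma) for _ in range(width)]
             for _ in range(height)]
    summed, averaged = [], []
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            block = (frame[y][x] + frame[y][x + 1]
                     + frame[y + 1][x] + frame[y + 1][x + 1])
            summed.append(block)          # 2x2 "binning" by summing
            averaged.append(block / 4.0)  # 2x2 box-filter downsample
    full = [v for row in frame for v in row]
    return snr(full), snr(summed), snr(averaged)

snr_full, snr_sum, snr_avg = simulate()
print(f"full-res SNR: {snr_full:.2f}")
print(f"2x2 sum SNR:  {snr_sum:.2f}")
print(f"2x2 avg SNR:  {snr_avg:.2f}")
```

Under these assumptions the summed and averaged results have the same SNR, since one is just the other scaled by 4, and both come out at roughly twice the full-resolution SNR.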
Software resizing works the same way. DxOMark normalises its camera test files to 8MP, which is why the D810 has so much DR in its "print" score.
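As a back-of-the-envelope check on the 8MP normalisation, the sketch below assumes uncorrelated noise, so SNR scales with the square root of the pixel-count ratio. As far as I understand it, that is the form DxOMark describes for its "print" figures, but treat it as an assumption. The function name `print_dr_bonus_stops` is made up, and the D810 pixel count is approximate.

```python
import math

REFERENCE_MP = 8.0  # the 8MP "print" reference resolution mentioned above

def print_dr_bonus_stops(sensor_mp, reference_mp=REFERENCE_MP):
    """Stops of DR gained by normalising sensor_mp down to reference_mp.

    Assumes uncorrelated noise: SNR improves by sqrt(sensor_mp / reference_mp).
    """
    return math.log2(math.sqrt(sensor_mp / reference_mp))

d810_mp = 36.3  # approximate effective pixel count (assumption)
print(f"D810 print bonus: {print_dr_bonus_stops(d810_mp):.2f} stops")
```

With these numbers the bonus for a sensor of about 36MP comes out at roughly one stop over the per-pixel ("screen") figure.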
Pixel binning? You know what "bin" means, right? Throwing away pixels. The A7R II, for example, does not bin any pixels; instead it does a full readout and resamples it down to 4K/HD.
That's incorrect. The dynamic range quoted for cinema cameras usually reflects the highest resolution that particular camera can capture, e.g. the Alexa Classic has 14+ stops of DR in 3K raw mode, and the Sony F65 has 14+ stops of DR in 8K raw mode (not full 8K, though).
When you properly downsample the image, each halving of the linear resolution (i.e. a quarter of the pixel count) increases SNR by about 6dB (1 stop), assuming the noise is uncorrelated between pixels. That means when you scale a 4K image down to 2K (with a good algorithm), the shadows become about 1 stop cleaner. That's why the KineMAX 6K has a "Golden 3K" mode which claims 16 stops of DR.
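The arithmetic behind the downsampling gain can be sketched as follows, assuming uncorrelated noise, so that averaging N pixels cuts the noise by sqrt(N). The function name `downsample_gain` is just for illustration, and real scalers and real sensor noise will not match this ideal exactly.

```python
import math

def downsample_gain(linear_factor):
    """SNR gain from averaging linear_factor x linear_factor pixel blocks.

    Assumes uncorrelated noise, so averaging N samples cuts noise by sqrt(N).
    Returns (gain in dB, gain in stops).
    """
    pixels_averaged = linear_factor ** 2
    snr_ratio = math.sqrt(pixels_averaged)
    gain_db = 20.0 * math.log10(snr_ratio)
    gain_stops = math.log2(snr_ratio)
    return gain_db, gain_stops

for label, factor in [("4K -> 2K", 2), ("8K -> 2K", 4)]:
    db, stops = downsample_gain(factor)
    print(f"{label}: +{db:.1f} dB, +{stops:.1f} stops")
```

On this model a 2x linear downscale (such as 4K to 2K, or 6K to 3K) is worth about 6dB, or one stop, in the shadows.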
CRI is a measurement based on human perception; it doesn't tell you much about how a camera sensor sees light. That's why the BBC developed "TLCI".
Also, "Ra" does not take into account the newer colour patches that were added, which contain desaturated colours. I believe the complete CRI standard is called R96a, or CRI Extended.