What is the Puffin file format, how does it relate to data sketches, and what role does it play in join optimization?