When reporting the size of the CSD, there are two values commonly quoted.
The number of entries in the CSD: each entry is an instance of crystal data reported in the published literature, so this count includes cases where the same dataset has been reported in two or more publications. The number of hits reported by CCDC software (ConQuest and the CSD Python API) is currently based on the number of CSD entries.
The number of structures in the CSD: this is a count of the unique combinations of data collection and refinement model, so the same dataset and refinement reported in more than one publication is counted only once. This number is shown on our statistics page: https://www.ccdc.cam.ac.uk/CCDCStats/
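The difference between the two counts can be illustrated with a small Python sketch. This is only a toy example under stated assumptions: the Entry record, its field names, and the sample values are hypothetical and are not taken from the CSD or from the CSD Python API. It simply counts every record as an entry, and counts each unique pairing of data collection and refinement model as a structure.

```python
from collections import namedtuple

# Hypothetical record type: the field names are illustrative only and
# do not correspond to real CSD fields or CSD Python API attributes.
Entry = namedtuple("Entry", ["identifier", "data_collection", "refinement_model"])


def count_entries(entries):
    """Every reported instance of crystal data counts as one entry."""
    return len(entries)


def count_structures(entries):
    """Each unique (data collection, refinement model) pair counts once."""
    return len({(e.data_collection, e.refinement_model) for e in entries})


# Made-up sample data.
entries = [
    Entry("ENTRY_A", "collection_1", "model_1"),
    Entry("ENTRY_B", "collection_1", "model_1"),  # same dataset, second publication
    Entry("ENTRY_C", "collection_1", "model_2"),  # same data, different refinement
    Entry("ENTRY_D", "collection_2", "model_3"),
]

print("entries:", count_entries(entries))
print("structures:", count_structures(entries))
```

In this made-up sample, ENTRY_A and ENTRY_B share both the data collection and the refinement model, so they add two to the entry count but only one to the structure count.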
For more information, please see: https://www.ccdc.cam.ac.uk/csd-1-million/csd-one-million-find-out-more/
A blog post describing the difference between entries and structures in more detail is available here: https://www.ccdc.cam.ac.uk/Community/blog/countdown_to_1_million/