GlyMDB Summary

As of June 2019, GlyMDB contains 5203 glycan microarray samples collected from the Consortium for Functional Glycomics (CFG). Data for the same lectin from multiple experiments on different glycan arrays (versions 1.0 to 5.2) or at different concentrations are counted as separate samples. Among the 5203 microarray samples, 1849 have protein sequence information available (Table 1 and Figure 1A). We ran a BLAST search of these sequences against all protein sequences from PDB structures, keeping matches with sequence similarity greater than 95%; the number of matched PDB entries is shown in Figure 1B. Because multiple microarray samples can share the same protein sequence and match the same PDB entries, we removed the redundancy. Consequently, the microarray samples contain 541 unique protein sequences, which are cross-linked to 1965 unique PDB entries. Of these 1965 PDB entries, 771 contain glycan ligands, and the length distribution of the largest glycan ligand in each PDB file is shown in Figure 1C.
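
The counting logic described above (samples with sequences, similarity filtering, and redundancy removal) can be illustrated with a minimal sketch. This is not the GlyMDB pipeline itself: the sample records, sequences, PDB identifiers, similarity values, and the function name `summarize` are all hypothetical toy values chosen for illustration, and only the 95% threshold comes from the text.

```python
from collections import defaultdict

# Hypothetical toy records: (sample_id, protein_sequence or None).
samples = [
    ("s1", "MKTAYIAK"),
    ("s2", "MKTAYIAK"),  # same lectin on another array version or concentration
    ("s3", "GSHMLEDP"),
    ("s4", None),        # no protein sequence information available
]

# Hypothetical BLAST hits: (query_sequence, pdb_id, percent_similarity).
hits = [
    ("MKTAYIAK", "1ABC", 100.0),
    ("MKTAYIAK", "2XYZ", 96.5),
    ("MKTAYIAK", "3LOW", 80.0),  # below the threshold, discarded
    ("GSHMLEDP", "1ABC", 97.2),  # same PDB entry matched by a second sequence
]

THRESHOLD = 95.0  # similarity cutoff stated in the text


def summarize(samples, hits, threshold=THRESHOLD):
    """Count samples, unique sequences, and unique matched PDB entries."""
    with_seq = [seq for _, seq in samples if seq is not None]
    unique_seqs = set(with_seq)  # redundancy removal at the sequence level

    # Map each unique sequence to the set of PDB entries above the cutoff.
    seq_to_pdb = defaultdict(set)
    for seq, pdb_id, similarity in hits:
        if seq in unique_seqs and similarity > threshold:
            seq_to_pdb[seq].add(pdb_id)

    # Redundancy removal at the PDB level: union over all sequences.
    unique_pdb = set().union(*seq_to_pdb.values())

    return {
        "n_samples": len(samples),
        "n_with_sequence": len(with_seq),
        "n_unique_sequences": len(unique_seqs),
        "n_unique_pdb": len(unique_pdb),
    }


summary = summarize(samples, hits)
print(summary)
```

In this toy input, four samples reduce to two unique sequences and two unique PDB entries, mirroring how 1849 sequence-bearing samples reduce to 541 unique sequences in the actual database.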