Molluscan Shell Matrix Proteins
The database contains FASTA sequences from UniProt, with associated metadata, for molluscan shell matrix proteins (SMPs). It includes only SMPs that have been experimentally validated as present in molluscan shell matrices (based on the publication(s) attached to the UniProt ID). Metadata includes the functional domains present in each sequence, as detected by InterProScan.
With the advent of Next Generation Sequencing technologies, the volume of published sequence data has made it computationally expensive to run sequence similarity algorithms against all of it. Moreover, it is impractical to sort through hundreds of sequence similarity search results when working with non-model organisms, since pre-established functional annotations of sequences are generally not available. This database was therefore created to provide a targeted molluscan biomineralization dataset for sequence similarity algorithms (such as BLAST).
Database created as part of doctoral research, funded under Marie Curie Innovative Training Networks (ITN) - Calcium in the Changing Environment (CACHE - Grant agreement 605051).
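As an illustration of how the FASTA sequences described above might be loaded before a similarity search, the following is a minimal Python sketch using only the standard library. It assumes the headers follow the usual UniProt FASTA layout (`db|ACCESSION|ENTRY_NAME description`); the header layout of the distributed file is not stated in this record, so that is an assumption. The function names and the two example records are hypothetical and are not taken from the database.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def accession(header: str) -> str:
    """Return the accession from a UniProt-style header.

    Assumes 'db|ACCESSION|ENTRY_NAME ...'; otherwise falls back to the
    first whitespace-delimited token of the header.
    """
    tokens = header.split()
    first = tokens[0] if tokens else ""
    parts = first.split("|")
    return parts[1] if len(parts) >= 3 else first


def index_by_accession(lines: Iterable[str]) -> Dict[str, str]:
    """Map each accession to its full sequence."""
    return {accession(h): seq for h, seq in read_fasta(lines)}


# Made-up records for illustration only (not real database entries).
example = """>tr|X00001|X00001_EXAMPLE Hypothetical shell matrix protein
MKTLLA
GGSC
>sp|X00002|X00002_EXAMPLE Another hypothetical protein
MAAAC
""".splitlines()

smps = index_by_accession(example)
for acc, seq in smps.items():
    print(acc, len(seq))
```

In practice the same functions could be fed an open file handle instead of the inline example, and the resulting accessions could be matched against the metadata table by UniProt ID.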
- Alternate title
- Polar Data Centre (PDC) record GB/NERC/BAS/PDC/01132
- Date (Publication)
- 2021-01-08
- Identifier
- http://www.antarctica.ac.uk/dms/metadata.php?id= / GB/NERC/BAS/PDC/01132
- Maintenance and update frequency
- Unknown
- Keywords
-
- NDGO0001
- NERC OAI Harvesting
-
- NERC_DDC
- GCMD Parameter Valids
-
- EARTH SCIENCE > Biosphere > Animal Taxonomy > Mollusks
- BAS Free-text keywords
-
- Biomineralization
- Molluscs
- SMPs
- Shell Matrix Proteins
- Shell formation
- Use limitation
- Data released under Open Government Licence V3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/.
- Access constraints
- Other restrictions
- Other constraints
- Data released under Open Government Licence V3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/.
- Metadata language
- English
- Topic category
-
- Biota
- Begin date
- unknown
- End date
- unknown
- Reference system identifier
- OGP / urn:ogc:def:crs:EPSG::4326
- Distribution format
-
- Hierarchy level
- Dataset
Domain consistency
- Measure identification
- INSPIRE / Conformity_001
Conformance result
- Date
- Explanation
- See the referenced specification
- Pass
- No
- Statement
- The database is based on UniProt ID entries uploaded up to July 2018. No major publications matching the inclusion criteria have been released since. Missing values occur only in the domain columns: a missing value indicates that no functional domains were detected in the sequence, based on InterProScan results from the InterPro databases.
- File identifier
- GB_NERC_BAS_PDC_01132
- Metadata language
- English
- Hierarchy level
- Dataset
- Date stamp
- 2021-01-08
- Metadata standard name
- NERC profile of ISO19115:2003
- Metadata standard version
- 1.0