Chapter 2. PubMed: The Bibliographic Database

The NCBI Handbook

Journal Selection for Index Medicus®/MEDLINE® describes the journal selection policy, criteria, and procedures for data submission.

Electronic Data Submission

Electronic data submission benefits everyone: publishers, the NLM, and users. For the NLM, it eliminates the tremendous costs associated with entering data by hand. For publishers and users, it means that newly published data appear rapidly and accurately in PubMed. Some publishers are now making pre-publication material available before it is formally published (“ahead of print” or “epub” citations); others are publishing electronic-only journals. Through close collaboration with the publisher, the citations for these publications can appear in PubMed on the same day that the article is published.

Furthermore, electronic data submission allows publishers to create links from abstracts in PubMed to the full text of the appropriate articles available on their own Web site. This can be achieved using LinkOut (Chapter 17). Both subscribers to the journals and other PubMed users can access the full text according to criteria that are determined by the publishers, increasing traffic to their sites.

Although the NLM works with many publishers directly, some publishers contract with commercial data aggregators, companies that prepare and submit the publisher's data to the NLM. Many aggregators also host publisher data on their Web sites.

Electronic Data Submission Process

All electronic data are supplied via FTP to NCBI in XML format, in accordance with the NLM's specifications (document type definition, or DTD). These specifications can be found in the NLM Standard Publisher Data Format document. The document includes information on XML tag descriptions, how to handle special characters (e.g., α or β), examples of tagged records, the PubMed DTD, and a FAQ section for participating or potential data providers. Publishers or other data providers who want to submit electronic data should write to: publisher@ncbi.v.

NCBI staff will guide new data providers through the approval process for file submission. New providers are asked to submit test files, which are then checked for XML formatting and syntax and for bibliographic accuracy and completeness. The files are revised and resubmitted as many times as necessary until all criteria are met. Once approved, a private account is set up on our FTP site to receive new journal issues or, in the case of online publications, individual articles as they are added to the publisher's Web site. We run a file-loading script that automatically processes the files daily, Monday through Friday at approximately 9:(Eastern Time). The new citations are assigned a PubMed ID number (PMID), a confirmation report is sent to the provider, and the new citations usually become available in PubMed sometime after 11: the next day, Tuesday through Saturday.
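
The kind of check applied to test files (XML syntax first, then the presence of basic bibliographic fields) can be sketched in a few lines of Python. This is only an illustration: the element names in the sample records and the REQUIRED_FIELDS list are hypothetical stand-ins, not the tag set of the NLM Standard Publisher Data Format, and a real check would validate against the DTD itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names for illustration; the real tag set is defined
# by the NLM Standard Publisher Data Format DTD.
REQUIRED_FIELDS = ("JournalTitle", "Volume", "Issue", "ArticleTitle")


def check_submission(xml_text):
    """Return a list of problems found in one submitted record."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Not well-formed XML: report the parser message and stop.
        return [f"XML syntax error: {err}"]
    problems = []
    for tag in REQUIRED_FIELDS:
        node = root.find(f".//{tag}")
        if node is None or not (node.text or "").strip():
            problems.append(f"missing or empty field: {tag}")
    return problems


good = """<Article>
  <JournalTitle>Example Journal</JournalTitle>
  <Volume>12</Volume>
  <Issue>3</Issue>
  <ArticleTitle>A sample title</ArticleTitle>
</Article>"""

# The Volume element is never closed, so parsing fails.
bad = "<Article><JournalTitle>Example Journal</JournalTitle><Volume>12</Article>"

print(check_submission(good))  # no problems expected
print(check_submission(bad))   # reported as a syntax error
```

A file that fails either check would be revised and resubmitted, which matches the iterative approval loop described above.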

After posting in PubMed, the citations are forwarded to NLM's Indexing Section for bibliographic data verification and for the addition of subject indexing terms from Medical Subject Headings [MeSH]. This process can take several weeks, after which time completed citations flow back into PubMed, replacing the originally submitted data.

Database Management and Hardware

PubMed is one of the NCBI databases within the relational database management system, Entrez (see Chapter 15). Entrez is a text-based search and retrieval system based on in-house software that uses an indexing system for rapid retrieval of information.

Requests for NCBI services, including PubMed, are first proxied through three load-balanced Dell PowerEdge 1650 servers, each with two central processing units. The proxy servers, in turn, load-balance requests forwarded on to the Web servers for PubMed and other NCBI services.

The PubMed Web servers comprise eight Dell PowerEdge 8450 servers. Each of these servers has eight central processing units, 8 GB of memory, and about 300 GB of disk space and runs the Linux operating system.

The Web servers retrieve PubMed records from two Sybase SQL database servers, which run on Sun Enterprise 450s. To accommodate the data volume output by PubMed and other Web-based services, the NLM has a high-speed connection (OC-3, up to 155 Mbits/sec) to the Internet, as well as a 622 Mbits/sec connection (OC-12) to Internet2, the noncommercial network used by many leading research universities.

Indexing

PubMed Citation Status and Assignment of MeSH Terms

Citations in PubMed are assigned one of three citation status tags that display next to the PubMed ID (PMID) numbers on all PubMed citations. The citation status tags indicate the citation's stage in the MEDLINE indexing process. The three tags are:

• [PubMed - as supplied by publisher]: This tag is displayed on citations added recently to PubMed via electronic submission from a publisher (which may or may not move on for MEDLINE MeSH indexing).
• [PubMed - in process]: This tag is displayed on citations that have had the first stage of quality review to verify that the journal, date, volume, and/or issue are correct. They will be reviewed for other accurate bibliographic data at the article level (e.g., pagination, authors, article title, and abstract) and indexed, i.e., the articles will be reviewed and MeSH vocabulary will be assigned (if the subject of the article is within the scope of MEDLINE).
• [PubMed - indexed for MEDLINE]: This tag is displayed on citations that have been indexed with MeSH, Publication Types, Registry Numbers, etc., and have been completely reviewed for accurate bibliographic data. Indexing is an intellectual process of assigning controlled vocabulary terms to describe the contents of the journal article and verifying other aspects of the citation data.

Most citations that are received electronically from publishers progress through “in process” status to MEDLINE status. Those citations not indexed for MEDLINE remain tagged [PubMed - as supplied by publisher]. Citations with “in process” status proceed to MEDLINE status after MeSH terms, publication types, sequence accession numbers, and other indexing data are added.
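
The progression through the three status tags can be pictured as a small state machine. The following Python sketch is an illustration only; the names Status, NEXT, advance, and tag are invented here and do not correspond to any NLM software.

```python
from enum import Enum


class Status(Enum):
    AS_SUPPLIED = "PubMed - as supplied by publisher"
    IN_PROCESS = "PubMed - in process"
    INDEXED = "PubMed - indexed for MEDLINE"


# Forward moves as described in the text. A citation that is not indexed
# for MEDLINE is simply never advanced and stays at AS_SUPPLIED.
NEXT = {
    Status.AS_SUPPLIED: Status.IN_PROCESS,
    Status.IN_PROCESS: Status.INDEXED,
}


def advance(status):
    """Move a citation one stage forward; INDEXED is the final stage."""
    return NEXT.get(status, status)


def tag(status):
    """Format the status the way it is displayed next to the PMID."""
    return f"[{status.value}]"


# Walk one citation through every stage and record the tags it shows.
s = Status.AS_SUPPLIED
history = [tag(s)]
while advance(s) is not s:
    s = advance(s)
    history.append(tag(s))
print(" -> ".join(history))
```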

All records are added to PubMed Monday through Friday and become available for viewing Tuesday through Saturday. For additional information, please see the NLM Fact Sheet: What's the Difference Between MEDLINE® and PubMed®?

The Automatic Computer Indexing Process

The aim of the computer indexing process is to automatically create multiple machine-readable access points that refer to the different components of the journal citations for use when searching PubMed. The citations are loaded into PubMed from both the NLM Data Creation and Maintenance System (DCMS) and directly from journal publishers (Figure 1). Both sources are in XML.

Figure 1. A schematic representation of PubMed data flow

During the computer indexing process, the citation information is broken down into index fields such as Journal Name, Author Name, and Title/Abstract. The words in each of the fields are checked against the corresponding index (e.g., title words in a new citation are looked up in the Title/Abstract Index). If the word already exists, the PMID of the citation is listed with that index term. If the word is a new one for the Index, it is added as a new Index term, and the PMID is listed alongside it. (At first, the new term will have only this one citation associated with it; this is how the PubMed indexes grow.)

Each PubMed citation is, therefore, associated with several indexes, and in cases such as the Title/Abstract Index, many different index terms can refer back to a single citation. Likewise, commonly used terms will refer to thousands of citations (the term “cell”, for example, is found in the Title/Abstract of 1,092,124 citations at the time of this writing). The Field Indexes can be browsed by using PubMed's Preview/Index function.
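
The indexing scheme described above is, in essence, an inverted index kept per field. A minimal Python sketch follows, under stated assumptions: the PMIDs and citation text are made up, and the word-splitting rule (lowercase runs of letters and digits) is a simplification, not PubMed's actual tokenization.

```python
import re
from collections import defaultdict

# One index per field: term -> set of PMIDs.
indexes = {
    "Title/Abstract": defaultdict(set),
    "Author Name": defaultdict(set),
}


def index_citation(pmid, fields):
    """Add one citation's words to the index of each of its fields."""
    for field, text in fields.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            # Existing term: the PMID joins its list.
            # New term: defaultdict creates it with this PMID as its only entry.
            indexes[field][word].add(pmid)


# Toy citations with made-up PMIDs.
index_citation(101, {"Title/Abstract": "Cell cycle control", "Author Name": "Smith J"})
index_citation(102, {"Title/Abstract": "Stem cell renewal", "Author Name": "Lee K"})

print(sorted(indexes["Title/Abstract"]["cell"]))     # [101, 102]
print(sorted(indexes["Title/Abstract"]["renewal"]))  # [102]
```

Searching a field then amounts to looking up a term and reading off its PMIDs, which is why retrieval stays fast even when a common term such as “cell” points to a very large number of citations.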

How PubMed Queries Are Processed

Automatic Term Mapping

PubMed uses Automatic Term Mapping to process words entered in the query box by someone searching PubMed. Terms entered without a qualifier, i.e., a simple text phrase that does not specify a search field, are looked up against the following translation tables and indexes in this order:

1. MeSH Translation Table
2. Journals Translation Table
3. Author Index

1. MeSH Translation Table

The MeSH Translation Table contains:

• MeSH Terms
• Subheadings
• See-Reference mappings (also known as entry terms) for MeSH terms
• Mappings derived from the Unified Medical Language System (UMLS) that have equivalent synonyms or lexical variants in English
• Names of Substances and synonyms to the Names of Substances (now known as Supplementary Concept Substance Names)

If the search term is found in this translation table, the term will be mapped to the appropriate MeSH term, and the Indexes will be searched as both the text word entered by the user and the MeSH term.

When a term is searched as a MeSH term, PubMed automatically searches that term plus the more specific terms underneath in the MeSH hierarchy.
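
Searching a term together with everything beneath it in the hierarchy can be sketched as a recursive walk. The Python example below is illustrative only: the CHILDREN table is a toy hierarchy, not real MeSH data, and it assumes a simple tree in which each term has a single parent.

```python
# Toy hierarchy (parent -> more specific children); not real MeSH data.
CHILDREN = {
    "Fruit": ["Citrus", "Berries"],
    "Citrus": ["Orange", "Lemon"],
}


def explode(term):
    """Return the term plus every more specific term beneath it."""
    found = [term]
    for child in CHILDREN.get(term, []):
        found.extend(explode(child))
    return found


print(explode("Fruit"))  # ['Fruit', 'Citrus', 'Orange', 'Lemon', 'Berries']
print(explode("Lemon"))  # ['Lemon']
```

A search on the broad term would then retrieve citations indexed under any term in the returned list.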

2. Journals Translation Table

If the search term(s) is not found in the MeSH Translation Table, the PubMed search algorithm then looks up the term in the Journals Translation Table, which contains the full journal title, MEDLINE abbreviation, and International Standard Serial Number (ISSN).
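
The ordered lookup behind Automatic Term Mapping can be sketched as a chain of table checks in which the first match wins. This Python example is a sketch under assumptions: the three tables hold toy entries, the function name map_term is invented, and the handling of journal matches, author matches, and unmatched terms is a placeholder, since the text above only specifies the behavior for a MeSH match.

```python
# Toy lookup tables for illustration only.
MESH_TABLE = {"vitamin c": "ascorbic acid"}      # entry term -> MeSH term
JOURNALS_TABLE = {"example journal of testing"}  # journal titles
AUTHOR_INDEX = {"smith j"}                       # author names


def map_term(query):
    """Look an untagged query up in each table in order; first hit wins."""
    term = query.strip().lower()
    if term in MESH_TABLE:
        # Searched as both the MeSH term and the text word the user typed.
        return "mesh", [("MeSH", MESH_TABLE[term]), ("text word", term)]
    if term in JOURNALS_TABLE:
        # Assumption: a journal match is searched as a journal name.
        return "journal", [("journal", term)]
    if term in AUTHOR_INDEX:
        # Assumption: an author match is searched as an author name.
        return "author", [("author", term)]
    # Assumption: anything else is treated as a plain text word.
    return "unmatched", [("text word", term)]


print(map_term("Vitamin C"))
print(map_term("Smith J"))
```

Because the MeSH Translation Table is consulted first, a phrase that is both an entry term and, say, part of a journal title is interpreted as a subject term.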
