Nearly every genealogical researcher and genealogical software program deals with genealogical data at the person level. A researcher finds a piece of data, concludes that it belongs to a specific person, and attaches it directly to that person.
Genealogical data is MUCH more than just information about individual persons. Genealogical data is built upon events found in source documents (e.g. birth certificate, family bible, diploma, tombstone, photograph, obituary, etc.).
An Event is the combination of:
event type: birth, death, burial, graduation, license, etc.
point in time: some type of calendar reference
geographical place: city, state, country, building, cemetery, etc.
names: full names of all persons and/or entities (e.g. organizations, companies, etc.) involved
roles played by each person involved in the event: newborn, doctor, deceased, minister, witness, bride, plaintiff, etc.
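The five parts of an Event listed above can be sketched as a small data structure. This is only an illustration, not a standard schema: the class names `Event` and `Participant`, the field names, and the sample birth record are all made up for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Participant:
    """One person or entity named in an event, with the role played."""
    name: str  # full name, exactly as written in the source
    role: str  # e.g. "newborn", "doctor", "witness"


@dataclass(frozen=True)
class Event:
    """Type + point in time + place + names + roles (hypothetical layout)."""
    event_type: str  # e.g. "birth", "burial", "graduation"
    date: str        # calendar reference, kept as text
    place: str       # city, state, country, building, cemetery, etc.
    participants: Tuple[Participant, ...] = ()

    def names_with_role(self, role: str) -> List[str]:
        """Return the names of everyone who played the given role."""
        return [p.name for p in self.participants if p.role == role]


# Invented sample data, for illustration only.
birth = Event(
    event_type="birth",
    date="12 Mar 1880",
    place="Springfield, Illinois",
    participants=(
        Participant("John Smith", "newborn"),
        Participant("Dr. A. Jones", "doctor"),
    ),
)

print(birth.names_with_role("newborn"))
```

Note that the date is stored as plain text here, so that whatever calendar reference the source uses can be kept as found.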
This may sound crazy, but each combination of source document and event should be treated as a separate piece of data (a.k.a. an extraction). It should NOT be directly associated with a specific person.
Each extraction should be tied to another extraction via some conclusion (a.k.a. assertion) reached by the researcher. The justification for this assertion (i.e. the facts on which it is based) should also be recorded. This allows other researchers to see how the original researcher reached the conclusion.
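One way to picture this is as two record types: an extraction that stands on its own, and an assertion that links two extractions and carries its justification. The sketch below is a minimal illustration under assumed names (`Extraction`, `Assertion`, and every field are hypothetical, and the sample records are invented).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """One event as found in one source document; not tied to any person."""
    extraction_id: str
    source: str      # e.g. "birth certificate"
    event_type: str  # e.g. "birth"
    date: str
    place: str
    name: str        # exactly as written in the source
    role: str


@dataclass(frozen=True)
class Assertion:
    """A researcher's conclusion that two extractions are the same individual."""
    first: Extraction
    second: Extraction
    researcher: str
    justification: str

    def __post_init__(self):
        # Refuse an assertion that does not say how it was reached.
        if not self.justification.strip():
            raise ValueError("an assertion must record its justification")


# Invented sample data, for illustration only.
birth_record = Extraction("E1", "birth certificate", "birth",
                          "12 Mar 1880", "Springfield, Illinois",
                          "John Smith", "newborn")
marriage_record = Extraction("E2", "marriage certificate", "marriage",
                             "4 Jun 1903", "Springfield, Illinois",
                             "John Smith", "groom")

link = Assertion(birth_record, marriage_record,
                 researcher="A. Researcher",
                 justification="marriage certificate gives the same birth "
                               "date and parents as the birth certificate")
print(link.justification)
```

Making the justification mandatory is a design choice of this sketch; it simply enforces in code the rule that the basis for every assertion gets recorded.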
To get a better understanding of how genealogical data is built from layers, look at the pyramid below. The base of all genealogical data, of course, is source documents. The information must be found somewhere, even if it is someone else's memory. Each layer built upon the source documents represents a step in how a researcher works with the data. Eventually the layers produce persons and family units.
The combination of persons and families can be used to generate reports, lists and books to explain the family history to other persons. If all is documented properly, then anyone who wishes will know where to find all source documents plus know how all conclusions/assertions were made by previous researchers.
To see a detailed document that was created by a committee of professional genealogists about this subject, click here [GenTech data model].
Click here [GDMUML] to read a discussion about Genealogical Data Models in UML (Unified Modeling Language).
Recap of the definition of each of the data layers in the pyramid:
Source Documents: the foundation for all genealogical information, especially primary documents (click here [record types] to see a separate section with a list of document types). Examples: a birth certificate, a college diploma, a photograph, an obituary, a city directory listing.
Extraction: each piece of information is separately extracted from a Source Document. Each extraction usually contains a data type (e.g. birth, will, graduation, etc.), date, place, and names of the persons/entities involved. Each extraction stands on its own, separate from all other extractions. Extreme care should be taken to record the data EXACTLY as it is found in the source document. The researcher can add special notes, but they should be clearly marked as the researcher's additions. Examples: birth, christening, job held, burial.
Assertion: through the use of some type of implicit or explicit proof (a.k.a. a conclusion or an assertion), two Extractions are declared to be for the same individual. The justification as to HOW the researcher arrived at the assertion should be recorded. Examples: a birth certificate and a marriage certificate, a birth certificate and a burial record, a marriage record and a land transaction.
Person: through the use of some type of proof and/or conclusion, two or more Assertions for the same person are used to build a dossier or profile of one individual: a Person. Most genealogical software begins at this level for data entry. The justification as to HOW the researcher arrived at the conclusion should be recorded.
Relationship: through the use of some type of proof or conclusion, two Persons are declared to be related in some way (e.g. parent-child, husband-wife, brother-sister, neighbors, business associates, etc.). The justification as to HOW the researcher arrived at the conclusion should be recorded.
Family: by virtue of their Relationships, two or more Persons are determined to be part of the same family.
List/Book/Report: using selected Persons and their Relationships in Families, a multitude of reports and lists, including entire books, can be generated. They can then be distributed to other persons, libraries, genealogical societies and web sites.
Justification: the list of reasons used by researchers to declare that two Extractions are for the same individual, to join two or more Assertions into one Person, or to join two Persons in a Relationship.
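The way the layers stack can be sketched in a few lines of code. This is a simplified illustration, not how any particular genealogy program works: extractions are reduced to bare ids, each assertion is an (id, id, justification) tuple, and a Person is taken to be any group of extractions connected by assertions. The helper `group_linked` and all ids and justifications are hypothetical.

```python
from collections import defaultdict


def group_linked(ids, links):
    """Group ids connected by links; each link is (id_a, id_b, justification)."""
    parent = {i: i for i in ids}

    def find(i):
        # Follow parent pointers to the group's representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, _justification in links:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for i in ids:
        groups[find(i)].append(i)
    return sorted(sorted(g) for g in groups.values())


# Extraction layer: invented ids, one per source document/event combination.
extractions = ["E1", "E2", "E3", "E4", "E5"]

# Assertion layer: pairs of extractions plus the recorded justification.
assertions = [
    ("E1", "E2", "same name, birth date and parents on both records"),
    ("E3", "E4", "same name and residence on both records"),
]

# Person layer: only groups joined by at least one assertion become Persons.
groups = group_linked(extractions, assertions)
linked = [g for g in groups if len(g) >= 2]
persons = {f"P{n}": members for n, members in enumerate(linked, start=1)}
unassigned = [g[0] for g in groups if len(g) == 1]

# Relationship layer: (person, person, kind, justification).
relationships = [
    ("P1", "P2", "husband-wife", "marriage record names both"),
]

# Family layer: Persons connected through Relationships.
family_links = [(a, b, why) for a, b, _kind, why in relationships]
families = [g for g in group_linked(list(persons), family_links) if len(g) >= 2]

print(persons)
print(unassigned)
print(families)
```

Because every link keeps its justification, a later researcher could drop a disputed assertion and rebuild the Persons and Families from the untouched extractions.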
Note: the exercise in the section with the example pedigree has five extractions (3 births and 2 deaths) and three relationships (1 marriage and 2 parent-child).