The goal of the annotation is to create SCUs that represent the content of the model summaries. Each SCU should be a relatively small information unit (e.g., labelled by an atomic sentence, with contributors that are no larger than a clause), and the SCUs should be easy to use when annotating previously unseen peer summaries. We recommend thinking of an SCU as a single, standalone proposition, or a simple sentence. This is not always possible when complex ideas are expressed, e.g., a relation between two propositions. Where it is possible, keeping each SCU to a simple proposition makes it easier to find SCUs that cover more of the model summaries, and it makes peer annotation easier.
Open the DUCView annotation tool. Load a file containing the model summaries. Select spans of text to create new SCUs, or to add to existing SCUs, until most of the text has been selected. Closed-class words, or other words with little semantic content, can remain unselected. Save the pyramid file.
General criteria to apply:
- You might begin by breaking up the first sentence of one of the summaries into simple clauses or clause-like units (e.g., gerundive phrases), which are potential contributors, and then looking for matches for each potential contributor in the other summaries. Or, you might begin by breaking up an entire summary into potential contributors before you look for matches. However you proceed, you should aim for each contributor to be no larger than a single atomic sentence. This will not always be possible: capturing a causal relation, for example, could require three SCUs: one for the antecedent event, one for the consequent event, and one for the causal relation.
- Label every SCU as clearly as possible, with a full sentence. See Labeling Guidelines.
The annotation tool makes it easy to select any one of the contributors as the default label, or to edit the label. Please use this facility.
- Select new contributors that are as similar as possible in meaning to those already selected, for a given SCU.
The annotation tool makes it easy to create discontinuous contributors, and to re-use words and phrases in multiple contributors. You can use the "Add contributor" button to append words and phrases to an existing contributor in order to create a discontinuous contributor. For example, if a relative clause is a contributor, you might also select the head word/phrase that the clause modifies to make the proposition explicit. Thus a single word in a summary can belong to the contributors of more than one SCU.
- Be careful to ensure that no summary has multiple contributors in a single SCU.
This applies especially to SCUs of weight less than four. The mapping between contributors and SCUs is one-to-one. Note that the tool will enforce the constraint that no SCU has more contributors than there are model summaries (four in this case) as long as you have entered a "Document Header RegEx".
- Do not be overly concerned with differences in tense or modality across summaries; since the summaries derive from clusters of news articles that may cover many days or weeks, different summaries can refer to the same event as being in the past or the future.
- Be especially careful when an SCU has only one contributor.
There will be a large number of SCUs with only one contributor, but these should be broken down sufficiently that decisions about peer annotation will be relatively clear-cut. The annotation tool makes it possible to collapse the SCU list so that you can scan the labels more easily. Also, you can use mouse drag-and-drop to move SCUs next to each other. For example, it might serve as a memory aid to group together SCUs that are about the same entity. See the options menu.
- Be careful not to have multiple SCUs that are almost identical in meaning, unless it is just this meaning difference that you intend to capture.
As the number of SCUs in your pyramid grows, it becomes more likely that you will lose track of content you have already captured in an SCU. In case this happens, the annotation tool makes it possible to merge SCUs using mouse drag-and-drop. See the options menu.
- Distinguish between general and specific. Frequently, the same information is expressed at different levels of abstraction. A summary might express a general statement, and one or more specific examples of the statement might occur in the same summary or in other summaries. You should aim for each SCU to capture a single level of abstraction, e.g., one SCU for the general statement, and one SCU for each specific example. For example, the sentence "Proponents of evolutionary theory use a plethora of evidence, including fossils and carbon-dating" would yield three SCUs: one for the general statement that "proponents of evolutionary theory use many sources of evidence," and one for each specific example of evidence ("proponents . . . use fossils as evidence," "proponents . . . use carbon-dating as evidence").
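
The contributor constraints in the criteria above (no summary has more than one contributor in a single SCU, and no SCU has more contributors than there are model summaries) can be pictured with a small data model. The following is only an illustrative Python sketch, not part of DUCView and not its file format: the class names, the field names, and the `check_scu` helper are hypothetical, the contributor texts are made up for illustration, and the sketch assumes that the weight of an SCU is simply its number of contributors.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: four model summaries, as in the guidelines above.
NUM_MODEL_SUMMARIES = 4


@dataclass
class Contributor:
    summary_id: int  # hypothetical identifier of the model summary
    text: str        # the selected words (possibly discontinuous)


@dataclass
class SCU:
    label: str  # a full sentence that states the proposition
    contributors: List[Contributor] = field(default_factory=list)

    @property
    def weight(self) -> int:
        # Assumption: weight = number of contributors.
        return len(self.contributors)


def check_scu(scu: SCU, num_models: int = NUM_MODEL_SUMMARIES) -> List[str]:
    """Return a list of problems found in one SCU (empty if none)."""
    problems = []
    ids = [c.summary_id for c in scu.contributors]
    # No summary may have more than one contributor in the same SCU.
    for sid in sorted({i for i in ids if ids.count(i) > 1}):
        problems.append(f"summary {sid} has more than one contributor")
    # An SCU cannot have more contributors than there are model summaries.
    if scu.weight > num_models:
        problems.append(
            f"weight {scu.weight} exceeds {num_models} model summaries"
        )
    return problems


# One general SCU and one specific SCU, following the evolution example.
general = SCU(
    "Proponents of evolutionary theory use many sources of evidence",
    [Contributor(1, "use a plethora of evidence")],
)
fossils = SCU(
    "Proponents of evolutionary theory use fossils as evidence",
    [Contributor(1, "Proponents ... use ... fossils"),
     Contributor(1, "fossils")],  # mistake: summary 1 selected twice
)

for scu in (general, fossils):
    print(scu.label, "->", check_scu(scu) or "ok")
```

In this sketch the second SCU is flagged because the same summary contributes twice; the usual fix would be to combine the two selections into one discontinuous contributor or to move one of them to a different SCU.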