This chapter explains all of the menu items. The notation ->Menu->Item denotes the menu item Item in the menu Menu. For example, the menu item Connection in the Manager menu is written ->Manager->Connection.
Establishes a connection to the server. It is not possible to work with corpora without a connection. You can either connect to a remote server through the Internet or an intranet, or launch the server locally. If the Internet connection is not set in the configuration, an automatic connection is performed once Bonito is launched. Otherwise this menu item is automatically selected and it is necessary to enter the user's name and password in the displayed window.
The connection is performed to a remote server, the address of which is in the entry box Server address. User name and password are required.
If you select this option, the server runs locally, that is, in the same computer as Bonito, and the whole manager and all corpora to be used have to be installed in this computer. The user's name and password are ignored. The access to the selected corpora is determined by the permissions of the corpus data files. For the server launch the command in the Server command entry box is used.
The address of the computer on which the server is running, e.g. aurora.fi.muni.cz.
User's login and password. These can be different from the user's login and/or password on the server. Password characters are not displayed in the entry box.
The command used to launch the local server, e.g. manateesrv "desam susanne" 2>errlog.
Changes the user's password. The old password entry is required and the new password entry is required twice. Password characters are not displayed in the entry boxes.
Redraws the contents of the result box according to the assignment in ->Display->Range.
Displays a window for setting the user's start-up options or saving options for the next start-up. Default option values are set to suit most users. Advanced users may change the following options:
You can select the language of the user interface here. The selected language becomes the default for subsequent start-ups.
Terminates the application.
Displays the information summary of the selected corpus: its name and additional information from the corpus configuration file, size (number of positions), all attributes and structure tags. The size of vocabulary is displayed (number of entries) for each attribute; that is, the number of different words (types), number of different lemmata, tags etc. -- according to the attribute. For each structure tag the number of its occurrences in the corpus is displayed.
Basic statistical values are computed for given attribute values: the number of occurrences in the whole corpus, number of occurrences of the whole bigram, the mutual information (MI) score and T-score.
Bigram occurrences are computed according to a given span (window). The default values from 1 to 1 mean that the second word must directly follow the first one. Values from 1 to 5 mean that the size of the span is 5, i.e. there can be up to 4 positions between the first word and the second one. A minus sign means the inverted order of the word occurrence. Thus from -1 to -1 means that the second word must directly precede the first one. From -5 to 5 then means that there can be a maximum distance of 5 positions between the words and that the words can be in an arbitrary order.
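As a rough illustration of how the span and the two scores interact, the following Python sketch counts bigram occurrences within a span and then computes the MI score and T-score. The function names are hypothetical, and the formulas are the commonly used definitions of these measures; this text does not state the exact formulas Bonito applies, so treat them as an assumption.

```python
import math

def count_bigram(tokens, first, second, span_from=1, span_to=1):
    """Count (first, second) pairs where `second` lies at an offset
    between span_from and span_to (inclusive) from `first`."""
    hits = 0
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        for offset in range(span_from, span_to + 1):
            j = i + offset
            # Offset 0 is the first word itself, so it is skipped.
            if offset != 0 and 0 <= j < len(tokens) and tokens[j] == second:
                hits += 1
    return hits

def mi_score(f_xy, f_x, f_y, n):
    """Mutual information (assumed formula): log2(f_xy * N / (f_x * f_y))."""
    return math.log2(f_xy * n / (f_x * f_y))

def t_score(f_xy, f_x, f_y, n):
    """T-score (assumed formula): (f_xy - f_x * f_y / N) / sqrt(f_xy)."""
    return (f_xy - f_x * f_y / n) / math.sqrt(f_xy)

tokens = "a b a c b a b".split()
adjacent = count_bigram(tokens, "a", "b", 1, 1)     # "b" directly follows "a"
preceding = count_bigram(tokens, "a", "b", -1, -1)  # "b" directly precedes "a"
within_two = count_bigram(tokens, "a", "b", 1, 2)   # "b" one or two positions after "a"
```

In this toy token list the span from 1 to 2 finds more pairs than the span from 1 to 1, because it also accepts one intervening position.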
Creates a subcorpus from the selected corpus according to given conditions. Subcorpora can be used only for queries. Any statistics computation uses the whole corpus.
The name of the corpus from which a subcorpus will be created. This is the selected corpus from the main Bonito window.
After a successful subcorpus creation a new corpus will be added to the list of available corpora. The name of the new corpus will be in the form base corpus:subcorpus name. For example, if the current corpus is susanne and we create a subcorpus named press there will be susanne:press in the corpus list.
The name of a structure tag (structure attribute) used for the subcorpus. The subcorpus will contain only positions that lie inside a structure of this name which fulfils the given condition. Subcorpora are typically created from the top-level structures which represent documents or texts.
This condition restricts the attribute(s) of the structure to certain values. For example, in the Susanne corpus the press reportage genre is in texts with names beginning with A. Text names are stored in the file attribute of the <doc> structure in Bonito. The condition for a `press reportage' subcorpus is then:
file = "A.."
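The effect of such a condition can be imitated in a few lines of Python. This is only a sketch: it assumes the quoted value is a regular expression that must match the whole attribute value, and the document list and the helper name matches_condition are made up for illustration.

```python
import re

def matches_condition(attr_value, pattern):
    """True if the whole attribute value matches the regular expression."""
    return re.fullmatch(pattern, attr_value) is not None

# Hypothetical values of the file attribute of <doc> structures.
docs = [{"file": "A01"}, {"file": "A12"}, {"file": "B03"}, {"file": "AB"}]

# "A.." means: the letter A followed by exactly two arbitrary characters.
press = [d["file"] for d in docs if matches_condition(d["file"], "A..")]
```

Under this assumption the value AB is rejected, because the pattern requires three characters.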
Displays a list of all available subcorpora and allows you to delete a selected subcorpus.
Displays all words from a selected attribute matching a given pattern.
Displays a list of all attributes of the current corpus and allows you to change the default attribute, which is used for queries without an attribute name.
Performs an evaluation of the given query (or P-filter, N-filter, or collocation, see Chapter 4). The evaluation itself is performed on the server: Bonito only receives the result. When the function is activated, a Stop button appears to the right of the corpus name. This button enables the current action to be canceled.
Displays the list of named queries. A user can select a query for an evaluation or delete a selected query from the list.
Displays the list of templates (see Chapter 5 -- Templates). A user can select a template for an evaluation, delete a selected template from the list, add a new template or change the text or the description of a selected template.
In the pop-up window it is possible to prepare a query using a graphical interface, without writing query syntax.
Adds a new template to the list of templates.
name of the template -- it will be used in queries
the template text itself
optional template description
After a template file is selected (e.g. from another user), its templates are added to the list of templates. The default template file extension is tpl.
Exports templates to the file.
After a query file is selected (e.g. from another user), its named queries are added to the list of named queries. The default query file extension is qry. A history file (history.qry) can also be imported; in this case only named queries are used.
Exports named queries to the given file.
Displays a concordance summary: the corpus name, the concordance size and a list of actions (beginning with the initial query) used for the concordance creation.
Saves the selected rows of the concordance list to a file.
Choose the encoding of the saved lines (value `-' means the server's corpus native encoding).
Selection of the header format for saving information about a query.
Saves currently displayed lines only.
Saves all lines of the concordance list.
If checked the individual lines will be numbered in the output file.
If checked the key words will be aligned one below the other.
The Context button enables changing the context for saving. This context is implicitly the same as the context for displaying (see ->View->Context). After pressing the Save button it is necessary to name the file being saved.
Prints the selected lines of the concordance to a printer. The same information is required as for ->Concordance->Save to file.
Deletes selected lines from the concordance list (see ->Select menu: how to select/highlight a line). The number of selected lines is displayed in the status line in the bottom border of the main window. Depending on the current range of displayed lines, it is possible that some of the selected lines are not displayed.
Reduces the number of lines in the concordance list. Specify which lines are to remain in the result and how many: a number of lines, a percentage, or hundredths of a percent of the initial number of lines.
Sorts the concordance list according to KWIC, left or right context and selected options.
Specifies how many positions are compared during the sort.
Specifies which positions from the line will be compared during the sort. The following schema assumes Number of positions to sort = 3. Number 1 marks the most important position (lines are sorted on this position first, and ties are resolved by the higher-numbered positions); number 3 marks the least important position.
                  ........ <KWIC KWIC> ............
left context         3 2 1 <         >
KWIC from left             <1 2 3    >
KWIC from right            <    3 2 1>
right context              <         > 1 2 3
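The schema can be read as an ordering of token positions around the KWIC. The Python sketch below builds the list of tokens in comparison order for each of the four choices. The function name, the mode strings and the convention that kwic_end is exclusive are assumptions made for this illustration, not part of Bonito.

```python
def sort_key(tokens, kwic_start, kwic_end, mode, n=3):
    """Return up to n tokens in comparison order (most important first).
    kwic_start is the index of the first KWIC token, kwic_end is exclusive."""
    if mode == "left context":
        # Nearest token to the left of the KWIC comes first.
        idx = range(kwic_start - 1, kwic_start - 1 - n, -1)
    elif mode == "KWIC from left":
        idx = range(kwic_start, min(kwic_start + n, kwic_end))
    elif mode == "KWIC from right":
        # Last KWIC token comes first, moving leftwards inside the KWIC.
        idx = range(kwic_end - 1, max(kwic_end - 1 - n, kwic_start - 1), -1)
    elif mode == "right context":
        idx = range(kwic_end, kwic_end + n)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [tokens[i] for i in idx if 0 <= i < len(tokens)]

tokens = ["the", "big", "old", "red", "fox", "ran", "far", "away"]
# The KWIC is "red fox": tokens[3:5].
left = sort_key(tokens, 3, 5, "left context")
right = sort_key(tokens, 3, 5, "right context")
kwic_l = sort_key(tokens, 3, 5, "KWIC from left")
kwic_r = sort_key(tokens, 3, 5, "KWIC from right")
```

Sorting the concordance lines by such a key list compares position 1 first and falls back to positions 2 and 3 only on ties.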
Ignores case when sorting.
Individual words will be sorted from the last character to the first one.
A positional attribute by which the sort will be performed. It is possible to choose from the positional attributes of the corpus.
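The case and reverse-order options can be pictured as transformations applied to each word before comparison. The following Python sketch is an illustration only; word_key and its parameter names are hypothetical.

```python
def word_key(word, ignore_case=False, retrograde=False):
    """Build the string that is actually compared during the sort."""
    if ignore_case:
        word = word.lower()
    if retrograde:
        # Compare from the last character to the first one.
        word = word[::-1]
    return word

words = ["walking", "Talked", "sing", "walked"]
by_ending = sorted(words, key=lambda w: word_key(w, ignore_case=True, retrograde=True))
```

Sorting from the last character groups words with the same ending, which is why the two -ed forms end up next to each other here.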
Sorts the concordance list according to one or more given conditions. Every condition determines one position according to which the individual lines will be compared.
If checked, there remains only one line in the result for all lines whose sort intervals match.
Adds another sort condition.
Deletes the selected sort condition. Selection is made with mouse.
Executes the sort.
Closes the window without executing any sort.
The number of the position which will be compared. Negative numbers mean positions before the chosen boundary, positive numbers mean positions after it. (See also ->Corpus->Statistics.)
Determines the beginning of the boundary in the same way as when specifying a filter or collocation (see Chapter 4 -- Queries).
Selection from corpus positional attributes.
Ignores case when sorting.
Individual words will be sorted from the last character to the first.
Sorts concordance lines according to an average frequency of words in the given context. After the sorting, the first lines in the concordance contain the most frequent words.
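A minimal Python sketch of this ordering follows. It assumes that frequency means the number of occurrences of a word in the whole corpus and that an empty context scores 0; the frequency table and the context lists are invented for the example.

```python
from collections import Counter

def average_frequency(context, freq):
    """Mean frequency of the words in the context (0.0 for an empty context)."""
    if not context:
        return 0.0
    return sum(freq[w] for w in context) / len(context)

# Hypothetical corpus frequencies and context words of three concordance lines.
freq = Counter({"the": 10, "of": 8, "cat": 2, "zebra": 1})
contexts = [["cat", "zebra"], ["the", "of"], ["the", "cat"]]

# Highest average frequency first.
ordered = sorted(contexts, key=lambda c: average_frequency(c, freq), reverse=True)
```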
Sorts concordance lines according to group numbers. Groups are sorted in ascending order (the first group has number 1), at the end of the concordance there are lines without group assignment.
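The ordering described above, with lines lacking a group placed last, can be sketched in Python as follows. Representing a line as a (text, group) pair with None for an unassigned group is an assumption of this example.

```python
# Hypothetical concordance lines: (text, group number or None).
lines = [("line a", 2), ("line b", None), ("line c", 1), ("line d", 2)]

# The first key element pushes ungrouped lines to the end;
# the second orders the remaining lines by ascending group number.
ordered = sorted(lines, key=lambda line: (line[1] is None, line[1] or 0))
```

Python's sort is stable, so lines within the same group keep their previous relative order.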
Undoes the last change. It displays the previous concordance. A user can cancel the last change (reduction, deletion, P/N-filtering) or sorting. Undo can be used repeatedly. The maximum number of undo steps can be changed in ->Manager->Settings: the default value is 5.
Redoes the last undone change.
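The interplay of repeated undo, redo and a limited number of undo steps can be modelled with a bounded history. The class below is a hypothetical Python sketch, not Bonito's implementation; it assumes that a new change discards any pending redo steps.

```python
from collections import deque

class ConcordanceHistory:
    """Keeps the current concordance plus bounded undo and redo stacks."""

    def __init__(self, initial, max_undo=5):
        self.current = initial
        self._undo = deque(maxlen=max_undo)  # oldest states fall off automatically
        self._redo = []

    def change(self, new):
        self._undo.append(self.current)
        self._redo.clear()  # assumption: a new change invalidates redo
        self.current = new

    def undo(self):
        if self._undo:
            self._redo.append(self.current)
            self.current = self._undo.pop()
        return self.current

    def redo(self):
        if self._redo:
            self._undo.append(self.current)
            self.current = self._redo.pop()
        return self.current

history = ConcordanceHistory(0)
for state in range(1, 8):  # seven changes, only the last five can be undone
    history.change(state)
undone = [history.undo() for _ in range(6)]
redone = history.redo()
```

With the default limit of 5, the sixth undo has no effect because the oldest states have already been dropped.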
Gives a name to the current concordance list. If you would like to use the results of a query again you can assign a name to a concordance and go directly to this concordance without a repeated query evaluation. Named concordances are directly accessible from ->Concordance->Named.
The list of all the named concordances is displayed and the requested concordance can be deleted by selecting from the list and pressing the Delete button.
Counts the frequency of words or other attributes, or their sequences in the requested positions.
Only sequences with frequency higher than the entered limit will be included in the result. The default limit of 0 means that all values will be counted. For concordance lists with a large number of lines, the full result can mean a large amount of data being passed from the server and sorted.
Displaying them can take a long time if more than several thousand lines are to be displayed, depending on the computer performance.
Every condition contains:
The attribute name (selected from the corpus positional attributes)
The position which will be compared.
It is possible to further change the way the results are displayed. This can be done using the following controls:
Lines with a frequency less than or equal to the entered limit are not displayed in the result. The number of displayed lines is always counted and shown alongside.
There are three possibilities for each entered position to be chosen:
Words will be displayed in the normal way.
Words will be displayed and their subtotal will be displayed. For the last position, the options Show and Sum are identical because the sum is always counted for that position.
Words will not be counted or displayed at all.
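The counting step and the frequency limit can be sketched in Python as follows. The sketch ignores the per-position display choices and assumes that positions are offsets relative to the first KWIC token; the function name and the sample lines are invented.

```python
from collections import Counter

def frequency_distribution(lines, positions, limit=0):
    """lines: list of (tokens, kwic_index); positions: offsets from the KWIC.
    Returns (value sequence, frequency) pairs with frequency above the limit."""
    counts = Counter()
    for tokens, kwic in lines:
        indices = [kwic + p for p in positions]
        # Skip lines where a requested position falls outside the line.
        if all(0 <= i < len(tokens) for i in indices):
            counts[tuple(tokens[i] for i in indices)] += 1
    return [(key, c) for key, c in counts.most_common() if c > limit]

lines = [
    (["a", "red", "fox"], 1),
    (["a", "red", "dog"], 1),
    (["the", "red", "fox"], 1),
    (["a", "big", "fox"], 1),
]
all_pairs = frequency_distribution(lines, [0, 1])
frequent = frequency_distribution(lines, [0, 1], limit=1)
```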
Computes the most important collocations in the given context according to the following parameters:
The attribute name: selected from positional attributes
The initial or terminal context position. Positive values are counted to the right from the end of KWIC, negative values are counted from the beginning of KWIC to the left.
Statistics will be counted only for the words whose total frequency in the corpus is higher than the entered frequency.
Statistics will be counted only for the words whose total frequency in the given context is higher than the entered frequency.
If there are more lines in the result, only the given number of the highest scored is displayed.
Determines the type of sort according to which the result lines will be displayed. This applies only to the selection of the most frequent lines (see previous parameter): displayed lines can be then sorted according to an arbitrary column.
The absolute frequency sort is similar to the T-score sort, and the relative frequency sort is identical to the MI-score sort (see details below).
The result is displayed in table form. The table can be saved to a file by pressing the Save button. The table can be sorted by an arbitrary column by right-clicking the header of the required column.
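The way context positions relate to the KWIC, and the minimum frequency in the given context, can be sketched in Python as follows. The helper names are hypothetical, kwic_end is taken to be the index of the last KWIC token, and position 0 is skipped because the text does not define it.

```python
from collections import Counter

def context_indices(kwic_start, kwic_end, pos_from, pos_to, length):
    """Translate context positions into token indices.
    Negative positions count left from the KWIC beginning,
    positive positions count right from the KWIC end (inclusive index)."""
    indices = []
    for p in range(pos_from, pos_to + 1):
        if p < 0:
            i = kwic_start + p
        elif p > 0:
            i = kwic_end + p
        else:
            continue  # assumption: position 0 is not a context position
        if 0 <= i < length:
            indices.append(i)
    return indices

def collocation_candidates(lines, pos_from, pos_to, min_context_freq=0):
    """lines: list of (tokens, kwic_start, kwic_end). Keeps words whose
    frequency in the context is higher than min_context_freq."""
    counts = Counter()
    for tokens, start, end in lines:
        for i in context_indices(start, end, pos_from, pos_to, len(tokens)):
            counts[tokens[i]] += 1
    return {w: c for w, c in counts.items() if c > min_context_freq}

lines = [
    (["the", "big", "red", "fox", "ran"], 2, 3),
    (["a", "big", "red", "fox", "sat"], 2, 3),
]
candidates = collocation_candidates(lines, -1, 1, min_context_freq=1)
```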
The meaning of the values in the individual columns follows:
The first column is named according to the name of the counted attribute (e.g. word). It contains the values of the given attribute for which the statistics were counted.
The mutual information of a word and a concordance.
The T-score of a word and a concordance.
The relative frequency of a word, i.e. the percentage of all the occurrences of the word in the given context.
The absolute frequency of a word, i.e. how often a word occurred in the given context.
A right mouse button click on a word in the first column displays a menu containing two items: P-filter and N-filter. Activating one of these items applies the appropriate filter to the current concordance.
A window is displayed in which the number of lines of the result (frequency) and the so-called reduced frequency can be seen. It also shows a graphical representation of the distribution of the individual result lines within the whole corpus. The X axis shows the individual corpus positions, and the Y axis shows the number of occurrences at the given position in the corpus.
If the individual lines of a concordance list are evenly distributed within the whole corpus, the individual lines in the graph are of the same length and are displayed evenly along the whole window length. If, conversely, most lines are from one part of the corpus (e.g. from one document), there are distinctly more long lines in one part of the window.
It is possible to jump to the selected part of the corpus by clicking on a line under the mouse cursor.
In this window you select which references should be displayed for the individual rows. If you choose Token number, the token number of the KWIC beginning is displayed. If you select the name of a tag (for example doc), the ordinal number of the respective tag containing the KWIC is displayed (for example, doc#2 means the KWIC is in the second document from the beginning of the corpus). If you select the name of a tag attribute (for example doc.file), the values of this attribute are displayed (e.g. doc.file=A03 means that the given KWIC is in a document where the attribute file has the value A03). References are displayed in green at the beginning of each row.
In this window the user can select which attributes (e.g. lemma, tag) will be displayed.
Selected attributes will be displayed only for KWIC positions.
Selected attributes will be displayed for KWIC positions and also for all positions in the displayed context.
Specifies which structure tags (structure attributes) will be displayed. If a structure tag contains attributes (e.g. sentence identifiers id of <s> tag), it is possible to tick the selected attribute. Then the values of this attribute will be displayed inside the relevant tag (e.g. <s id=12/3>).
Specifies in which context the words will be displayed. A character, a position or an arbitrary structure tag can be a unit for both the left and the right side. If a character is a unit, whole words are displayed in such a way that at least the entered number of characters is displayed.
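The character-based rule, where whole words are shown until at least the requested number of characters is reached, can be sketched in Python for the right-hand side. The function name is hypothetical, and counting the separating spaces towards the total is an assumption of this sketch.

```python
def right_context_by_chars(tokens, start, min_chars):
    """Return whole tokens from tokens[start:] until the displayed text
    (tokens joined by single spaces) has at least min_chars characters."""
    shown = []
    total = 0
    for tok in tokens[start:]:
        if total >= min_chars:
            break
        # A separating space is counted for every token after the first.
        total += len(tok) + (1 if shown else 0)
        shown.append(tok)
    return shown

tokens = ["the", "quick", "brown", "fox"]
seven = right_context_by_chars(tokens, 0, 7)
three = right_context_by_chars(tokens, 0, 3)
```

Requesting 7 characters yields two whole words, because cutting after the first word would display only 3 characters.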
Determines which lines will be displayed and the number of displayed lines. If the entered number of lines is 0, all rows are displayed.
The previous/following page is displayed. The number of rows per page can be changed (->Manager->Options), the default value is 20.
Displays the specified range of lines (see ->View->Range) starting with the specified line. The number of the line from which the lines are displayed is shown in the status line before a `+' sign.
Selected lines can be stored in the clipboard or deleted from the concordance list (see ->Concordance->Delete selected). Selected lines are displayed against a blue background. The number of selected lines is shown in the status line at the bottom of the main window.
Lines can be selected with the mouse or the keyboard. The left mouse button or the space bar selects an unselected line or cancels the selection of a selected line. Shift+left mouse button selects all the lines between the given line and the current line.
Selects all lines.
Cancels any selection.
Selects non-selected lines and vice versa.
Puts the selected lines in the clipboard for copying to other applications.
Passes selected lines to the corpus editor, CED. This function is available only on the UNIX platform.
Shows a window with brief application information and the Manatee version number.
Shows a license window.
If selected, a short description of the GUI widget (button, entry box, etc.) under the mouse cursor is displayed when the cursor rests on it for a while.
Launches on-line documentation in a web browser.