Creation and appropriate use of passport, characterization and other databases for ex situ collections should be among the priorities of any genebank. The value of any collection depends strictly on the completeness of information about each accession. An item (accession) of a collection in any genebank is a plant form which must be registered and precisely identified. The problem of conservation and identification of global PGR is currently becoming the major one for many genebanks of the world (Steiner et al., 1997; MacKenzie, 1998; Diederichsen et al., 1999). Most genebanks that were set up in the mid-1960s and 1970s face the problem of regenerating accessions after long-term storage for 20 to 30 years. At the same time, many of them lack strict criteria for identifying accessions at the species level, to say nothing of their capability to maintain intraspecific diversity.
The criteria are based on clearly visible and easily distinguishable morphological characters. The majority of these criteria systems were created under the guidance of N.I. Vavilov in the 1920s and 1930s and were later developed and refined on the basis of new data from studies of plant diversity (Loskutov, 1999; 2000).
In view of the above, the identification of duplicates by comparing databases has recently received increasing attention (van Hintum, 1994; van Hintum & Knupffer, 1995a; van Hintum & Visser, 1995b; van Hintum, 2000). At present, this problem is mostly addressed by approaches that analyse the databases alone, without taking into account the characteristics of the conserved seed material. The problem is directly linked with the need to optimize and systematize the existing collections, and the identification of duplicates in collections held by genebanks is an aspect of special importance.
We believe that the botanical component is extremely important for solving this problem. This view is supported by the fact that data on specific, and especially intraspecific, classification are gaining significance not only for botanical research and breeding purposes, but also for genebanks seeking genetic purity of the live ex situ seed collections they maintain.
According to Wesenberg (1992), the 30 main world collections (those with more than 200 accessions each) maintain about 94 500 accessions belonging to the genus Avena L. The European Avena Database contains more than 27 500 entries, including complete data for the VIR collection (Bucken & Frese, 1998).
Rationalization of genebank collections is one of the most significant problems, and solving it is very important for improving the efficiency of PGR conservation on our continent.
The search for duplicated material between germplasm collections of different countries is necessary, in the first place, for identifying advanced cultivars which have the same name and are stored in different collections. Retrieving duplicate accessions with the same identification numbers by cultivar name presupposes their morphological identity, and it is intraspecific systematics that is used to establish the identity of such accessions.
Quite often, accessions of national collections were received from other genebanks. When comparing collections for possible duplication, it is desirable to have the most complete information about the origin of an accession, that is, the original name of the accession, its catalogue number in the donor genebank, its catalogue numbers in other genebanks of the world, and the place of origin and reproduction of the accession.
At the end of 1999, VIR, ZADI and NGB agreed to implement a pilot project comparing the Avena collections conserved in Russia, Germany and Sweden. To identify duplicates in the collections, we compared the passport databases of four European Avena germplasm collections using Corel Paradox 8 for Windows 98. There are 11918 accessions in the VIR collection, 2921 in IPK, 1825 in BAZ and 478 in NGB. In total, over 17100 entries were analysed. This equals 62% of the European Avena Database and 18% of the total Avena collections in the world.
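The totals and shares quoted above can be re-derived with a few lines of Python. This is only an arithmetic check: the reference totals are the approximate figures cited from Wesenberg (1992) and Bucken & Frese (1998), and the variable names are chosen here for illustration.

```python
# Accession counts per collection, as reported in the text.
collections = {"VIR": 11918, "IPK": 2921, "BAZ": 1825, "NGB": 478}

# Approximate reference totals quoted from the cited sources.
EUROPEAN_AVENA_DB = 27500   # "more than 27 500 entries"
WORLD_AVENA_TOTAL = 94500   # "about 94 500 accessions"

total = sum(collections.values())
share_european = 100 * total / EUROPEAN_AVENA_DB
share_world = 100 * total / WORLD_AVENA_TOTAL

print(f"total entries analysed: {total}")
print(f"share of European Avena Database: {share_european:.0f}%")
print(f"share of world Avena collections: {share_world:.0f}%")
```

Because the reference totals are themselves rounded, the resulting percentages should be read as approximate.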
All the above-mentioned principles were used in our work to compare passport data between collections. For the analysis of passport data, a special structure for a joint database was developed to facilitate the comparison of accessions by their names and additional information (Tables 1 and 2).
This structure allowed the development of a compact database that ensured the comparative analysis of accessions in the mentioned collections.
For analysing and identifying duplicates, the most important fields are ACCNAME, DONORNUM, OTHERNUM, as well as the SCINAM field for determining identity in terms of classification units (i.e., genus, species, subspecies, etc.).
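The role of these fields can be illustrated with a minimal Python sketch. It assumes that each passport record is a dictionary holding the ACCNAME and SCINAM fields named above, plus a hypothetical COLLECTION key for the holding genebank. The sample records and the normalization rule are invented for illustration and do not reproduce the procedure used in the project. The groups it returns are only candidates, which would still have to be checked by a crop expert.

```python
from collections import defaultdict


def normalize(text):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def candidate_duplicates(records):
    """Group records by (ACCNAME, SCINAM); keep groups held by more than one collection."""
    groups = defaultdict(list)
    for rec in records:
        if not rec["ACCNAME"].strip():
            continue  # unnamed accessions cannot be matched by name
        key = (normalize(rec["ACCNAME"]), normalize(rec["SCINAM"]))
        groups[key].append(rec)
    return {
        key: recs
        for key, recs in groups.items()
        if len({rec["COLLECTION"] for rec in recs}) > 1
    }


# Invented sample records; COLLECTION is a hypothetical helper field.
records = [
    {"COLLECTION": "VIR", "ACCNAME": "Cultivar A", "SCINAM": "Avena sativa"},
    {"COLLECTION": "IPK", "ACCNAME": "cultivar  a", "SCINAM": "Avena sativa"},
    {"COLLECTION": "BAZ", "ACCNAME": "Cultivar A", "SCINAM": "Avena strigosa"},
    {"COLLECTION": "NGB", "ACCNAME": "Cultivar B", "SCINAM": "Avena sativa"},
]

matches = candidate_duplicates(records)
for (name, species), recs in matches.items():
    print(name, "|", species, "|", sorted(rec["COLLECTION"] for rec in recs))
```

Including SCINAM in the key means that two accessions with the same name but different scientific names (the BAZ record in the sample) are not proposed as duplicates.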
It was found that in the IPK database the fields DONORNUM (15%) and OTHERNUM (1%) are not sufficiently filled in, while the SCINAM field is well populated; however, subtaxa are identified in only 88% of all cases, and 180 accessions have no species identification (i.e. Avena sp. only). The database at NGB featured all of the above-mentioned drawbacks. All its accessions are identified at the species level only, without botanical varieties. In BAZ, the database has sufficiently complete DONORNUM (85%) and OTHERNUM (21%) fields, but the SCINAM field (40% for botanical varieties only) is very poor, confusing and quite often incorrect.
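Fill rates of this kind can be computed as the share of records with a non-empty value in a given field. The following Python sketch assumes that records are dictionaries keyed by field name and that empty strings or missing values mean "not filled in"; the helper name fill_rate and the sample values are hypothetical.

```python
def fill_rate(records, field):
    """Return the percentage of records with a non-empty value in `field`."""
    if not records:
        return 0.0
    filled = sum(1 for rec in records if (rec.get(field) or "").strip())
    return 100 * filled / len(records)


# Invented sample records for illustration only.
sample = [
    {"DONORNUM": "D-001", "OTHERNUM": ""},
    {"DONORNUM": "D-002", "OTHERNUM": "O-17"},
    {"DONORNUM": "", "OTHERNUM": ""},
    {"DONORNUM": "D-004", "OTHERNUM": None},
]

rates = {field: fill_rate(sample, field) for field in ("DONORNUM", "OTHERNUM")}
print(rates)
```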
We found several duplicates within collections (identified as D in the ADD field). Within the analysed collections, 130 duplicates were found: 70 accessions in the IPK collection, 54 in BAZ and only 6 in NGB (Table 3).
The analysis of passport data showed that 464 accessions in the joint database have some confusing information. These accessions require additional checking and consultation with genebank experts.
The analysis of the four databases shows that 90% of the accessions in the VIR database are unique among the four collections; the share is 70% in the IPK database and 50-60% in the databases of BAZ and NGB. The numbers of unique accessions were as follows: VIR - 10770 accessions, IPK - 2080, BAZ - 1120 and NGB - 270.
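As a consistency check, the shares of unique accessions can be recomputed from the collection sizes and the unique counts reported in the text. The unique counts appear to be rounded, so this Python sketch can only be expected to reproduce the quoted percentages approximately.

```python
# Collection sizes and unique-accession counts, as reported in the text.
totals = {"VIR": 11918, "IPK": 2921, "BAZ": 1825, "NGB": 478}
unique = {"VIR": 10770, "IPK": 2080, "BAZ": 1120, "NGB": 270}

# Share of unique accessions per collection, rounded to whole percent.
unique_share = {name: round(100 * unique[name] / totals[name]) for name in totals}

for name, share in unique_share.items():
    print(f"{name}: {share}% unique")
```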
As mentioned previously, this compact database represents about 18% of the International Avena Database. The number of unique entries was also analysed using the DONORNUM and OTHERNUM fields, which hold identification numbers from other genebanks, especially from the American genebank. This analysis showed that the percentage of unique accessions, when compared with other genebanks as well, was lower: VIR - 80%, IPK - 60%, NGB - 45% and BAZ - 25%.
From the analysis of databases, 1276 accessions have been found to have the same scientific name and be duplicated in 2 (1034 entries), 3 (207 entries), or sometimes in 4 (35 entries) genebanks at a time (Tables 5, 6 and 7).
It was found that some collections are contaminated and some accessions are misidentified, but most often accessions are not comprehensively determined taxonomically. In some cases, the botanical identification recorded in the database does not correspond to the morphological characters of the stored accessions. This shows that some genebanks have not maintained original seed entries or herbarium seed samples with which regenerated seed accessions could be compared and contaminated material checked.
In our opinion, any formal approach aimed at revealing similar-sounding names should be used by the crop expert with the utmost care, as entirely different cultivars may have similar or identical names.
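One way to apply such a formal approach cautiously is to let it only flag pairs of names for expert review rather than merge them automatically. The Python sketch below uses character-level string similarity from the standard difflib module as a stand-in for "similar-sounding"; it is not a phonetic method, and the threshold of 0.8, the function name and the sample names are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_name_pairs(names, threshold=0.8):
    """Return (name_a, name_b, ratio) for pairs whose similarity ratio meets the threshold."""
    flagged = []
    for a, b in combinations(names, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            flagged.append((a, b, round(ratio, 2)))
    return flagged


# Invented names for illustration only.
names = ["Nordstern", "Nordstjerna", "Alba"]

pairs = similar_name_pairs(names)
for a, b, ratio in pairs:
    print(f"review needed: {a!r} vs {b!r} (ratio {ratio})")
```

A flagged pair says nothing about genetic or morphological identity; it only tells the curator where to look.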
Determination of the genetic identity of advanced cultivars with the same parents is also problematic, as the contents of Pedigree fields in databases leave much to be desired.
Talking of the genetic identity of populations or landraces would be premature, as relevant information for these materials is very scanty in passport databases, or quite often absent, in most genebanks of the world.
All discussions about duplicates of gene alleles in any accessions, or about parental duplication are unfounded due to the extreme scarcity of information.
The use of various molecular-biological techniques for identifying duplicates in collections is hardly reasonable if these techniques are inefficient and costly, while present-day collections in the world number tens and hundreds of thousands of accessions. For example, of the more than 31 000 accessions in the European Barley Database, only 148 were analysed using a molecular marker technique (van Hintum & Visser, 1995b).
Our opinion is that the identification of duplicates within and between germplasm collections of particular crops must be performed by an expert or collection curator specializing in the crop concerned, as these are the people who can, thanks to their knowledge and experience, understand the value and significance of a duplicate and come to a well-weighed decision in each particular case.
Thus, practical work with databases to unify collections, using the taxonomic approaches of interspecific and intraspecific classification, helps us to better understand, properly evaluate and carefully maintain the global diversity of plant genetic resources for future generations.