Cluster and Principal Component Analysis of Semi-Dwarf Tef [Eragrostis tef (Zucc.) Trotter] Recombinant Inbred Lines with Emphasis to Lodging

Tef is the main cereal crop widely produced and consumed in Ethiopia and is preferred by millions of local smallholder farmers. It has also recently gained recognition as a food crop in other parts of the world because of its gluten-free grain and nutritive value. Lodging is the major factor that greatly reduces the yield and quality of both the grain and the straw of tef. The current study was conducted to group the lines according to their similarities, assess the magnitude of genetic distances among them, and identify the contribution of individual traits to the total variation. A total of 49 lines were evaluated for 16 traits using a simple lattice design at Holetta and Debre Zeit in the 2017 main rainy season. All traits evaluated over the locations showed highly significant differences among the lines except fertile tillers per plant, while the line x location interaction effect was highly significant for most of the traits. Cluster analysis grouped the lines into four clusters based on their similarity. The highest inter-cluster distance was noted between clusters two and four, while the lowest was between clusters one and four. Principal component analysis showed that about 77.6% of the gross variance among lines was explained by five principal components with eigenvalues greater than unity. This study revealed that four recombinant inbred lines had higher yield than the local and standard checks. RIL# 14 showed the highest grain yield, a low lodging index and more desirable traits than all other lines; it could therefore serve as a base for and strengthen future tef breeding if incorporated as plant material, especially for addressing the lodging problem.

Tef is the main cereal crop widely produced and consumed in Ethiopia and favored by millions of local smallholder farmers [7]. In terms of area of cultivation, it is the leading cereal crop followed by maize and wheat. According to the Central Statistical Agency [8], the area covered by tef during the 2019/2020 cropping season was over 3.1 million hectares or 30% of the total area occupied by cereals in the country.
Despite being a staple food for many people in Ethiopia for centuries, tef has gained prominence as a food crop in other parts of the world only very recently. This interest is mainly associated with its gluten-free grain and its nutritive value, which is generally comparable with that of other common cereals [9][10][11][12]. It is also grown as a pasture crop in several countries [13]. The straw from tef is a valuable source of livestock feed because it is more palatable and nutritious than that from wheat and barley [14].
Tef is a highly versatile crop with respect to adaptation to different agro-ecologies, being widely grown from sea level up to 2800 m.a.s.l., with reasonable resilience to both drought and waterlogging [13]. The national average yield of tef is about 1.85 tons per hectare [8], but it has a potential of yielding four to five tons of grain per hectare if the lodging problem is resolved [15]. The major yield-limiting factors are the lack of cultivars that are tolerant to lodging and the shortage of improved varieties [16].
Besides, the grains are also often lost in the harvesting and threshing process because of their minute size and traditional cultural practices [17]. Tef possesses tall, weak stems that easily succumb to lodging due to wind or rain. In addition, lodging hinders the use of high input husbandry practices since the application of increased amounts of nitrogen fertilizer to boost the yield results in severe lodging [16].
Lodging greatly reduces the yield and quality of both the grain and the straw. It is reported to decrease tef grain yield by approximately 15 to 45%, depending on the weather conditions and the inherent nature of the variety used [18]; it also hampers both manual and mechanical harvesting [16]. Using lower seed rates and late sowing dates relatively decreases the problem of lodging. Although various attempts have been made by the research community to develop lodging-resistant tef cultivars [13,19], no cultivar with reasonable lodging resistance has been obtained to date, except the novel tef mutant kegne and GA-10-3, which have a semi-dwarf phenotype resulting in increased lodging tolerance [20].
The tef germplasm accessions show wide genetic variability in phenological, morphological and agronomic traits [9,13,22]. In spite of this, there has been a lack of sufficient variability in the tef germplasm for some valuable traits such as lodging and shattering resistance. In the recent past, the chemical mutagen ethyl methane sulphonate (EMS) has been successfully used to induce semi-dwarf tef variants with lodging resistance as well as tolerance to aluminum toxicity and other acidity-related soil fertility problems [23][24][25]. The first semi-dwarf lodging-tolerant tef line, called kegne, was developed from an ethyl methane sulphonate-mutagenized population [20].
Some important work has also been reported based on morphological, molecular and biochemical markers. According to Tareke [26], many efforts have been made in the past to apply different techniques and tools to improve tef. One of them is the inter-specific crossing made between tef (Eragrostis tef) and Eragrostis curvula in an attempt to transfer the lodging tolerance of Eragrostis curvula to tef. However, so far, no viable hybrid has been obtained from the crosses. Efforts were also made to develop doubled haploids using the gynogenesis technique, and some promising tef lines were obtained [27]. The variation noted in panicle length (14-65 cm), culm length (11-82 cm), plant height (31-155 cm) and culm thickness (1.2-4.5 mm) indicates the potential for developing lodging-resistant genotypes through gene recombination, as suggested by [4].
Through the many efforts made so far, almost 51 improved varieties have been released to the farming communities [28]. However, the development of high-yielding and lodging-tolerant tef varieties adapted to the changing climate remains the primary focus of tef research [29,30].
In particular, semi-dwarf tef types have not yet been studied much, and there is no lodging-resistant tef variety [31]. Therefore, the current study was conducted with the following objectives.
Objectives: 1) To classify the lines based on their similarities and determine the level of genetic divergence among the clusters. 2) To identify the major traits that contribute to the overall genetic variability among semi-dwarf tef lines, so that these traits can be emphasized in further tef breeding.

Descriptions of Experimental Locations
The field experiment was carried out at two locations (Debre Zeit and Holetta) in the central part of Ethiopia during the 2017 cropping season (July to December). Debre Zeit is located 47 km southeast of Addis Ababa, while Holetta is located 42 km to the west of Addis Ababa. The Debre Zeit Agricultural Research Center (DZARC) is found at 8° 44' N, 38° 58' E and 1860 m.a.s.l., whereas the Holetta Agricultural Research Center (HARC) is found at 9° 03' N, 38° 30' E and 2400 m.a.s.l. The two locations represent two different agro-ecologies of the country. Debre Zeit receives a mean annual rainfall of 832 mm during the main growing season, with maximum and minimum mean annual temperatures of 24.3°C and 8.9°C, respectively. The experimental field at Debre Zeit is characterized by heavy black soil (Vertisol) with a pH of 6.9, described as a very fine montmorillonitic typic pellustert with very high moisture retention capacity [32,33].
In contrast, Holetta usually receives a total annual rainfall of 1100 mm, with maximum and minimum mean annual temperatures of 24.1°C and 6.6°C, respectively. The experimental field at this location is characterized by light red soil (Andosol) with a pH of 6.3 and good moisture holding capacity. The weather conditions during the growing season were favorable, and the experiment received a sufficient amount of rainfall for normal growth of the tef crop at each of the test locations.

Planting Materials
The experimental plant materials comprised 49 semi-dwarf tef lines: 45 recombinant inbred lines (RILs) derived from the cross DZ-01-192 x GA-10-3, the two parents (pure lines), one standard check and one local check (Table 1).
The RILs are descendants of the intra-specific cross, advanced by selfing through continuous maintenance of progenies up to the seventh filial generation (F7) using the F2-derived single-seed-descent breeding method. The tef cultivar DZ-01-192 is late maturing, thick culmed and tall, and has a loose panicle and white seed color. GA-10-3 is a mutant line developed through mutation breeding using ethyl methane sulphonate (EMS), assisted by the Targeting Induced Local Lesions IN Genomes (TILLING) method, and introduced from the University of Bern (Switzerland). It has lodging tolerance, early maturity, a semi-dwarf structure and pale white seed color. The materials were kindly supplied by the Debre Zeit Agricultural Research Center (DZARC), Ethiopia, which is duly acknowledged for its kindness.

Experimental Design, Layout and Management
The field experiments were conducted using a 7 x 7 simple lattice design with two replications at both locations. Each plot (1 m x 1 m) consisted of five rows of 1 m length with an inter-row spacing of 0.2 m. The spacing was 1 m between plots and between incomplete blocks, and 1.5 m between replications. The tef recombinant inbred lines were allotted to plots at random within each replication. Sowing was done on 13 August and 25 July 2017 at the Debre Zeit and Holetta research centers, respectively. As per the research recommendations, a seed rate of 15 kg/ha was used at both locations.
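As a rough illustration of how a 7 x 7 simple lattice can be laid out, the following Python sketch builds the two replications from a square array of entry numbers: the incomplete blocks are the rows of the array in the first replication and the columns in the second. The function name, the seed and the randomization steps are assumptions made for illustration and do not reproduce the actual field book of the experiment.

```python
import random


def simple_lattice(k=7, seed=2017):
    """Return two replications of a k x k simple lattice.

    Each replication is a list of k incomplete blocks with k entries each.
    Entries are numbered 1..k*k (49 lines when k = 7).
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    entries = list(range(1, k * k + 1))
    rng.shuffle(entries)  # random placement of lines in the square array
    square = [entries[r * k:(r + 1) * k] for r in range(k)]

    # Replication 1: incomplete blocks are the rows of the square.
    rep1 = [row[:] for row in square]
    # Replication 2: incomplete blocks are the columns of the square.
    rep2 = [[square[r][c] for r in range(k)] for c in range(k)]

    for rep in (rep1, rep2):
        rng.shuffle(rep)        # randomize block order within a replication
        for block in rep:
            rng.shuffle(block)  # randomize plot order within a block
    return rep1, rep2


rep1, rep2 = simple_lattice()
```

With this construction every line appears once per replication, and any two lines share an incomplete block in at most one of the two replications.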
The fertilizer rate used at each location was the one recommended for its soil type. The rates were 40 kg N, 60 kg P2O5 and 11 kg S per hectare for Holetta (light red soil) and 60 kg N, 60 kg P2O5 and 11 kg S per hectare for Debre Zeit (Vertisol). All the NPS was applied at planting at a rate of 158 kg/ha, and the remaining N was supplied as urea at 22 kg/ha for HARC and 65 kg/ha for DZARC. Half of the urea was applied at sowing and the remaining half at tillering. Hand weeding and other management practices were performed as required at both locations.
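The product rates and the nutrient rates above can be cross-checked with a short calculation. The sketch below assumes an NPS blend containing 19% N, 38% P2O5 and 7% S and urea containing 46% N; these compositions are not stated in the text and are used here only as plausible assumptions.

```python
# Assumed nutrient contents (fraction by weight); not given in the text.
NPS = {"N": 0.19, "P2O5": 0.38, "S": 0.07}
UREA_N = 0.46


def nutrients(nps_kg_ha, urea_kg_ha):
    """Nutrient supply (kg/ha) from the given NPS and urea product rates."""
    supply = {name: nps_kg_ha * frac for name, frac in NPS.items()}
    supply["N"] += urea_kg_ha * UREA_N  # urea contributes nitrogen only
    return {name: round(value, 1) for name, value in supply.items()}


holetta = nutrients(158, 22)     # HARC, light red soil
debre_zeit = nutrients(158, 65)  # DZARC, Vertisol
```

Under these assumptions the product rates correspond to roughly 40 kg N, 60 kg P2O5 and 11 kg S per hectare at Holetta and roughly 60 kg N, 60 kg P2O5 and 11 kg S per hectare at Debre Zeit, in line with the stated nutrient rates.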

Data Collected
Data were collected on sixteen quantitative traits, including seven traits recorded on a plot basis and nine traits assessed on five randomly taken plants from the central rows of each plot. For traits sampled on individual plants, the averages of the five plants per plot were used for statistical analyses.
The following data were taken on a plot basis:
Days to heading/panicle emergence (DH): Number of days from seedling emergence to the appearance of the tips (about 5 cm) of the main shoot panicle on 50% of the plants in a plot. Note that the tef panicle emerges without showing a booting stage, unlike other small cereals such as wheat and barley, but similar to rice.
Days to maturity (DM): Number of days from seedling emergence to physiological maturity as judged by the change to straw color of the vegetative parts on 75% of the plants in the plot.
Grain filling period (GFP): Computed as the difference between the number of days to maturity and the number of days to panicle emergence.
Above ground biomass yield (ABM): The total dry weight in kilograms of the above ground biomass per plot before threshing.
Grain yield (GY): The weight in kilograms of the grain from the entire plot after threshing and sun drying.
Harvest index (HI): The ratio of grain yield to the total biomass in percent.
Lodging index (LI): Lodging score (LS) was recorded on a 0-5 scale as the degree of leaning from the upright position, where zero = completely upright, non-lodged plants and five = plants completely flat on the ground. The severity of lodging for each degree was assessed as the proportion, in percent, of plants in a plot manifesting that degree of lodging. Finally, the lodging index for each plot was computed as the average of the product sum of each degree of lodging and the corresponding severity as indicated in the formula above.
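The derived plot-level traits defined above can be sketched in a few lines of Python. Because the lodging index formula itself is not reproduced in the text, the divisor of five (the number of non-zero lodging degrees) is an assumption, as are the function names and the example plot values.

```python
def grain_filling_period(days_to_heading, days_to_maturity):
    """GFP: days from panicle emergence to physiological maturity."""
    return days_to_maturity - days_to_heading


def harvest_index(grain_yield_kg, biomass_kg):
    """HI: grain yield as a percentage of above ground biomass."""
    return 100.0 * grain_yield_kg / biomass_kg


def lodging_index(percent_by_score, divisor=5):
    """Sum of (lodging degree x % of plants at that degree) / divisor.

    percent_by_score maps a lodging score (0..5) to the percentage of
    plants in the plot showing it. The divisor of 5 is an assumption.
    """
    if abs(sum(percent_by_score.values()) - 100.0) > 1e-6:
        raise ValueError("severity percentages must sum to 100")
    total = sum(score * pct for score, pct in percent_by_score.items())
    return total / divisor


# Hypothetical plot, for illustration only.
gfp = grain_filling_period(days_to_heading=50, days_to_maturity=110)
hi = harvest_index(grain_yield_kg=0.25, biomass_kg=1.0)
li = lodging_index({0: 40.0, 2: 30.0, 4: 30.0})
```

For this hypothetical plot the sketch gives a grain filling period of 60 days, a harvest index of 25% and a lodging index of 36 on a 0-100 scale.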
The following observations were recorded based on measurements made on five randomly taken and pre-tagged plants from the three central rows of each plot.

Cluster and Distance Analyses
Cluster analysis was used to group genotypes into homogeneous sets based on their response to the environments considered [33]. A hierarchical cluster analysis approach was used to examine the grouping pattern of the 49 tef lines based on their similarity with respect to the corresponding means of all the 15 traits studied. The cluster analysis was done to group the tested tef genotypes into genetically distinct classes using SAS Statistical Software Version 9.3 [35], following the average linkage method. The number of clusters was determined based on the pseudo-F and pseudo-t² options of the SAS cluster procedure. The dendrogram was constructed based on complete linkage, and Euclidean distance was used as the measure of dissimilarity.
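The grouping step can be illustrated with a minimal Python sketch of agglomerative clustering with average linkage on Euclidean distances between standardized trait means. It is a stand-in for, not a reproduction of, the SAS procedure used in the study; the text mentions both average and complete linkage, and only average linkage is shown here. The function names and the six-line, two-trait data set are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each trait (column) so that no trait dominates the distance."""
    cols = list(zip(*rows))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) or 1.0 for c in cols]  # guard against constant traits
    return [[(x - m) / s for x, m, s in zip(r, mu, sd)] for r in rows]


def average_linkage(rows, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    z = standardize(rows)
    clusters = [[i] for i in range(len(z))]

    def dist(a, b):
        # Average linkage: mean Euclidean distance over all cross pairs.
        return mean(math.dist(z[i], z[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Hypothetical means of six lines for plant height (cm) and grain yield (t/ha).
toy = [[60, 1.8], [62, 1.9], [95, 2.6], [97, 2.7], [120, 3.4], [118, 3.5]]
groups = average_linkage(toy, n_clusters=3)
```

In this toy data set the lines fall into three pairs of similar height and yield, which is the grouping the sketch recovers.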
Genetic distances between clusters were calculated on standardized data using Mahalanobis's D² statistic [36] as $D^2_{ij} = (x_i - x_j)' \, \mathrm{cov}^{-1} \, (x_i - x_j)$, where $D^2_{ij}$ = the distance between cases i and j, $x_i$ and $x_j$ = vectors of the values of the variables for cases i and j, and $\mathrm{cov}^{-1}$ = the inverse of the pooled within-groups variance-covariance matrix. The D² values obtained for pairs of clusters were treated as calculated values of Chi-square (χ²) and tested for significance at both the 1% and 5% probability levels against the tabulated values of χ² for P degrees of freedom, where P is the number of traits considered [37].
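A minimal Python sketch of the D² calculation is given below for two hypothetical cluster mean vectors and a hypothetical pooled covariance matrix for two traits. Rather than inverting the covariance matrix explicitly, it solves the equivalent linear system; the helper names and all numbers are assumptions for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_d2(xi, xj, cov):
    """D2 = (xi - xj)' cov^-1 (xi - xj)."""
    d = [a - b for a, b in zip(xi, xj)]
    y = solve(cov, d)  # y = cov^-1 d
    return sum(a * b for a, b in zip(d, y))


# Hypothetical two-trait example.
cov = [[4.0, 1.0], [1.0, 2.0]]
d2 = mahalanobis_d2([10.0, 5.0], [6.0, 4.0], cov)

# Tabulated chi-square values for 2 degrees of freedom (two traits here).
CHI2_5PCT, CHI2_1PCT = 5.991, 9.210
significant_5pct = d2 > CHI2_5PCT
```

Here D² equals 4.0, which is below the tabulated χ² value of 5.991 for two degrees of freedom, so the two hypothetical clusters would not be declared significantly distant at the 5% level. With 15 traits, the corresponding tabulated values are 24.996 (5%) and 30.578 (1%).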

Principal Component Analysis
Principal component analysis was done using Minitab statistical software, release 17 for Windows (Minitab, 2007), to identify the traits that contributed most to the total variation among the genotypes [38]. In the principal component analysis, components with eigenvalues greater than one were considered important in explaining the observed variability.
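The retention rule can be sketched as follows: the eigenvalues of the trait correlation matrix are computed and only components with eigenvalues greater than one are kept. The sketch uses a plain Jacobi rotation method instead of the Minitab routine used in the study, and the three-trait correlation matrix is hypothetical.

```python
import math


def jacobi_eigenvalues(a, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix of three equally correlated traits.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
eigenvalues = jacobi_eigenvalues(corr)
retained = [ev for ev in eigenvalues if ev > 1.0]   # eigenvalue > 1 rule
explained = [ev / len(corr) for ev in eigenvalues]  # share of total variance
```

For this matrix the eigenvalues are 2.0, 0.5 and 0.5, so a single component, explaining two thirds of the total variance, would be retained.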

Cluster Analysis
Cluster analysis grouped the 49 semi-dwarf tef lines into four clusters based on their mean values and similarity, using the average linkage clustering method in SAS version 9.3 (Figure 1). The number of clusters was determined from the pseudo-F and pseudo-t² values: the pseudo-F reaches its peak, being larger than the values before and after it in the list, while the pseudo-t² is at a minimum and is followed by large values. This classified the test materials into four distinct clusters at about the 75% level of similarity, which could be classified further. The number of lines per cluster was nineteen in cluster one, fifteen in cluster two, thirteen in cluster three and only two in cluster four (Table 2). Lines grouped within the same cluster are assumed to be more closely related in terms of the studied traits than lines grouped into different clusters.
Cluster four had higher mean values for days to heading, days to maturity, grain filling period, plant height, panicle length, culm length, second basal culm internode length, above ground biomass, grain yield and harvest index than the other clusters. In contrast, cluster two consisted of lines that had the lowest values for traits such as days to maturity, grain filling period, plant height, culm length, peduncle length, second basal culm internode length, above ground biomass, grain yield and lodging index (Table 2). Lines in cluster two were the earliest, the shortest in plant height, culm length, second culm internode length and peduncle length, and the lowest yielding in grain and biomass.
The clustering pattern found for the 49 semi-dwarf tef recombinant inbred lines is similar to that of earlier studies. Habte [39] grouped 21 tef varieties and landraces into four clusters at about 60% similarity. Temesgen [40] reported four and six clusters based on 14 traits from 144 heterogeneous germplasm populations using data obtained at Holetta and Ginchi, respectively, and Costanza [41] reported three clusters using 39 accessions. Tadesse [17] formed six major clusters from 35 cultivars, and other reports showed six main clusters at 75% similarity from 36 tef germplasm populations [21,42].

Inter-Cluster Distance Analysis
The highest inter-cluster distance was measured between clusters two and four, while the lowest was measured between clusters one and three (Table 3). Genetic improvement through hybridization and selection depends on the extent of variability among the lines. Crossing for desirable traits can be successful between clusters with the highest and the lowest inter-cluster distance.

Principal Component Analysis
In the principal component analysis (PCA), carried out to estimate the relative contribution of traits to the variation among the 49 tef lines, 77.6% of the variation was explained by the first five PCs, which had eigenvalues greater than one, out of the fifteen PCs extracted for the 15 traits. Therefore, five PCs were retained to explain the observed variation without losing a substantial part of the variability (Table 4).
The first PC explained about 34%, the second 14%, the third 11.7%, the fourth 10.9% and the fifth 6.9% of the variation. Plant height, culm length, above ground biomass and panicle length showed the greatest loadings on the first PC. Similarly, grain filling period, harvest index, lodging index and grain yield contributed to the second PC, while days to heading and number of spikelets per main panicle had significant loadings on the third PC. In the fourth PC, days to maturity was the important trait, while in the fifth PC, number of spikelets per main panicle, lodging index and above ground biomass accounted for much of the observed gross variation.
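The reported proportions can be checked against the eigenvalue criterion with a short calculation. The sketch assumes that the PCA was run on the correlation matrix of the 15 traits, in which case the eigenvalues sum to 15 and each eigenvalue equals the proportion of variance explained multiplied by 15; this is an assumption, since the text does not state which matrix was analyzed.

```python
n_traits = 15
pct = [34.0, 14.0, 11.7, 10.9, 6.9]  # PC1..PC5, % of variance as reported

# Running total of the variance explained by the first 1..5 components.
cumulative = [round(sum(pct[:i + 1]), 1) for i in range(len(pct))]

# Implied eigenvalues under the correlation-matrix assumption.
implied_eigenvalues = [round(p / 100 * n_traits, 3) for p in pct]
all_above_one = all(ev > 1.0 for ev in implied_eigenvalues)
```

The rounded percentages add up to 77.5%, which agrees with the reported 77.6% within rounding. The implied eigenvalue of the fifth component is about 1.04, just above one, which is consistent with retaining five components.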
The percentage contribution of the first five principal components to the gross genetic variation obtained in the current study (77.6%) is lower than the 81% of Kebebew et al. [43] and the 80.6% of Temesgen et al. [44], while it is greater than the 71% of Kebebew et al. [45]. This indicates that the variation depends on the type of material used in the study. There was a sharp decline in contribution from PC1 to PC2 and then from PC2 to PC3, while the rate of decrease became progressively smaller for the remaining PCs. This shows that the first few principal components made the greatest contribution to the overall variation among the lines for the 15 traits considered in this study.

Conclusion
The national average yield of tef is about 1.85 tons per hectare, but it has a potential of yielding four to five tons of grain per hectare if the lodging problem is resolved. Lodging substantially reduces the yield and quality of both the grain and the straw. The variance analysis results showed the presence of considerable variation among the 49 semi-dwarf tef lines for almost all the traits, thereby suggesting a high chance of selecting lines for traits of interest. The results of the analysis of variance allow further genetic analyses to be carried out for all traits except number of fertile tillers per plant, which was not significant.
Cluster analysis grouped the lines into four clusters based on their similarity. The highest inter-cluster distance occurred between clusters two and four, while the lowest was between clusters one and four. Principal component analysis showed that about 77.6% of the gross variance among lines lay in PC1 to PC5, and that the total variance was loaded largely by traits like plant height, panicle length and days to maturity.
To this end, the results revealed the existence of considerable variation for most traits of the test inbred lines, thus indicating the possibility of exploiting the variability in further tef breeding. Recombinant inbred lines like RIL-14 have a significantly low lodging index, longer panicles and a higher number of spikelets per panicle, as well as the highest above ground biomass and grain yield. Genotypes identified with better grain yield related traits and reasonable lodging tolerance require further evaluation and eventual release to the farming communities in tef-growing environments in Ethiopia.