Contents

Preface III

CHAPTER 1 Populations in Genetic Studies 1
1.1 Commonly Used Populations in Genetic Studies 2
1.1.1 Bi-Parental Populations 2
1.1.2 Multi-Parental Populations 5
1.1.3 Considerations in Developing Genetic Populations 9
1.2 Preliminary Analysis of Genotypic Data 12
1.2.1 Collection and Coding of Genotypic Data 12
1.2.2 Gene Frequency and Genotypic Frequency 17
1.2.3 Fitness Test on Genotypic Frequencies 18
1.3 Genetic Effect and Genetic Variance 20
1.3.1 Calculation of Population Mean and Phenotypic Variance 20
1.3.2 One-Locus Additive and Dominance Model 23
1.3.3 Population Mean and Genetic Variance at One Locus 24
1.4 ANOVA on Single Environment Trials 27
1.4.1 Linear Decomposition on Phenotypic Observation 27
1.4.2 Decomposition of Sum of Squares of Phenotypic Deviations 28
1.4.3 Single Environmental ANOVA on Rice Grain Length 31
1.5 ANOVA on Multi-Environment Trials 32
1.5.1 Linear Decomposition on Phenotypic Observation 32
1.5.2 Decomposition of Sum of Squares of Phenotypic Deviations 33
1.5.3 Multi-Environmental ANOVA on Rice Grain Length 38
1.6 Estimation of Genotypic Values and the Broad-Sense Heritability 39
1.6.1 Genotypic Values and Broad-Sense Heritability from Single Environmental Trials 39
1.6.2 Genotypic Values and Broad-Sense Heritability from Multi-Environmental Trials 41
1.6.3 Estimation of Genotypic Values Under Heterogeneous Error Variances 42
Exercises 45

CHAPTER 2 Estimation of the Two-Point Recombination Frequencies 51
2.1 Generation Transition Matrix 51
2.1.1 Usefulness of the Transition Matrix in Linkage Analysis 51
2.1.2 Transition Matrix of One Generation of Backcrossing 53
2.1.3 Transition Matrix of One Generation of Selfing 55
2.1.4 Transition Matrix of Doubled Haploid 58
2.1.5 Transition Matrix of Repeated Selfing 59
2.1.6 Expression of the Two-Locus Genotypic Frequencies in Matrix Format 61
2.2 Theoretical Genotypic Frequencies at Two Loci 62
2.2.1 Theoretical Frequencies of 10 Genotypes at Two Loci 62
2.2.2 Theoretical Frequencies of 4 Homozygotes in Permanent Populations 65
2.2.3 Genotypic Frequencies of Two Co-Dominant Loci in Temporary Populations 65
2.2.4 Genotypic Frequencies of One Co-Dominant Locus and One Dominant Locus in Temporary Populations 69
2.2.5 Genotypic Frequencies of One Co-Dominant Locus and One Recessive Locus in Temporary Populations 69
2.2.6 Genotypic Frequencies of Two Dominant Loci in Temporary Populations 74
2.2.7 Genotypic Frequencies of One Dominant Locus and One Recessive Locus in Temporary Populations 74
2.2.8 Genotypic Frequencies of Two Recessive Loci in Temporary Populations 77
2.3 Estimation of Two-Point Recombination Frequency 77
2.3.1 Maximum Likelihood Estimation of Recombination Frequency in DH Populations 77
2.3.2 General Procedure on the Maximum Likelihood Estimation of Recombination Frequency 81
2.3.3 Estimation of Recombination Frequency Between One Co-Dominant and One Dominant Marker in F2 Population 86
2.3.4 Initial Values in Newton Algorithm 87
2.3.5 EM Algorithm in Estimating Recombination Frequency in F2 Populations 90
2.3.6 Effects on the Estimation of Recombination Frequency from Segregation Distortion 92
Exercises 95

CHAPTER 3 Three-Point Analysis and Linkage Map Construction 101
3.1 Three-Point Analysis and Mapping Function 102
3.1.1 Genetic Interference and Coefficient of Interference 102
3.1.2 Mapping Function 105
3.2 Construction of Genetic Linkage Maps 107
3.2.1 Marker Grouping Algorithm 107
3.2.2 Marker Ordering Algorithm 111
3.2.3 Use of the k-Optimal Algorithm in Linkage Map Construction 113
3.2.4 Rippling of the Ordered Markers 117
3.2.5 Integration of Multiple Maps 118
3.3 Comparison of the Recombination Frequency Estimation in Different Populations 121
3.3.1 LOD Score in Testing the Linkage Relationship in Different Populations 121
3.3.2 Accuracy of the Estimated Recombination Frequency 123
3.3.3 Least Population Size to Declare the Significant Linkage Relationship and Close Linkage 124
3.4 Linkage Analysis in Random Mating Populations 127
3.4.1 Linkage Dis-Equilibrium in Random Mating Populations 127
3.4.2 Generation Transition Matrix from Diploid Genotypes to Haploid Gametes 130
3.4.3 Gametic and Genotypic Frequencies in Populations After Several Generations of Random Mating 132
Exercises 134

CHAPTER 4 Single Marker Analysis and Simple Interval Mapping 139
4.1 Single Marker Analysis 140
4.1.1 Phenotypic Means of Different Genotypes at One Marker Locus 140
4.1.2 Single Marker Analysis by t-Test in Populations with Two Genotypes 143
4.1.3 Single Marker Analysis by t-Test in Populations with Three Genotypes 146
4.1.4 ANOVA in Single Marker Analysis in Populations with Three Genotypes 150
4.1.5 Likelihood Ratio Test in Single Marker Analysis 151
4.1.6 Problems with Single Marker Analysis 153
4.2 Simple Interval Mapping 154
4.2.1 Frequencies of the QTL Genotypes in a Marker Interval 154
4.2.2 Maximum Likelihood Estimation of Phenotypic Means of QTL Genotypes 161
4.2.3 Testing for the Existence of QTL 166
4.2.4 Estimation of Genetic Effects of QTL and Its Contribution to Phenotypic Variance 167
4.2.5 Applications of Simple Interval Mapping in DH and F2 Populations 168
4.2.6 Phenomenon of ‘Ghost’ QTL in Simple Interval Mapping 171
4.2.7 Other Problems with Simple Interval Mapping 172
4.3 Threshold Values of LOD Score in QTL Mapping 174
4.3.1 Significance Level and Critical Value of One Test Statistic 174
4.3.2 Distribution of the LRT Statistic at Single Scanning Positions in the Absence of Any QTL 176
4.3.3 Factors Affecting the Distribution of the Genome-Wide Largest LOD Score 177
4.3.4 Number of Effective Tests and the Empirical LOD Score Thresholds in QTL Mapping 180
4.3.5 Permutation Test and the Empirical LOD Score Thresholds in QTL Mapping 184
Exercises 189

CHAPTER 5 Inclusive Composite Interval Mapping 195
5.1 Importance of the Control on Background Genetic Variation in QTL Mapping 196
5.2 Inclusive Composite Interval Mapping in DH Populations 199
5.2.1 Additive Genetic Model of One Single QTL 199
5.2.2 Additive Genetic Model for Multiple QTLs 201
5.2.3 One-Dimensional Scanning and Hypothesis Testing for Additive QTLs 202
5.2.4 Application of ICIM in a DH Mapping Population in Barley 204
5.3 Inclusive Composite Interval Mapping in F2 Populations 208
5.3.1 Additive and Dominant Model of One Single QTL 208
5.3.2 Additive and Dominant Model for Multiple QTLs 212
5.3.3 One-Dimensional Scanning and Hypothesis Testing in Additive and Dominant QTL Mapping 213
5.3.4 Application of ICIM in an F2 Mapping Population 214
5.4 Type II Error in Hypothesis Testing and Statistical Power in QTL Detection 216
5.4.1 Type II Error and Statistical Power in Hypothesis Testing 216
5.4.2 Probability of Two Types of Error and the Appropriate Sample Size 220
5.4.3 Distribution and Effect Models of QTLs Used in Power Analysis by Simulations 222
5.4.4 Calculation of the Detection Power and False Discovery Rate in QTL Mapping 224
5.5 Comparison of IM and ICIM by Simulation 230
5.5.1 QTL Detection Power and FDR from IM 230
5.5.2 QTL Detection Power and FDR from ICIM 232
5.5.3 Detection Powers Counted by Marker Intervals 234
5.5.4 Suitable Population Size Required in QTL Mapping 235
5.6 Avoiding the Overfitting Problem in the First Step of Model Selection in ICIM 237
Exercises 240

CHAPTER 6 QTL Mapping for Epistasis and Genotype-by-Environment Interaction 245
6.1 Epistatic QTL Mapping in DH Populations 246
6.1.1 Linear Regression in Epistatic QTL Mapping and the Statistical Properties 246
6.1.2 Two-Dimensional Scanning on Di-Genic Epistatic QTLs 248
6.1.3 Genetic Variance on Epistatic QTLs with Linkage 253
6.1.4 Simulation Study on Epistatic QTL Mapping in DH Populations 254
6.2 Epistatic QTL Mapping in F2 Populations 257
6.2.1 The Di-Genic Epistasis Model in F2 Populations 257
6.2.2 Epistatic QTL Mapping Procedure in F2 Population 258
6.2.3 Detection Power of Epistatic QTLs in F2 Populations 265
6.3 Genetic Analysis and Detection Power of the Most Common Di-Genic Interactions 268
6.3.1 Genetic Effects in Di-Genic Interactions 268
6.3.2 Decomposition of Genetic Variance at the Presence of Di-Genic Epistasis 270
6.3.3 Power Simulation of Epistatic QTL Mapping 276
6.3.4 Issues in Epistatic QTL Mapping 281
6.4 Mapping of the QTL by Environment Interactions 282
6.4.1 Mapping of the Additive QTL by Environment Interactions 282
6.4.2 Mapping of the Epistatic QTL and Environment Interactions 285
6.4.3 QTL and Environment Interactions in One Actual RIL Population in Maize 287
Exercises 291

CHAPTER 7 Genetic Analysis in Hybrid F1 of Two Heterozygous Parents and Double-Cross F1 of Four Homozygous Parents 295
7.1 Linkage Analysis in the Hybrid F1 Derived from Two Heterozygous Parents 296
7.1.1 Categories of Polymorphism Markers 296
7.1.2 Unknown Linkage Phases in Heterozygous Parents and Genotypes in Their F1 Progenies at Two Loci 298
7.1.3 Estimation of the Recombination Frequency Between Two Fully-Informative Markers 299
7.1.4 Haploid Type Rebuilding in the Heterozygous Parents 302
7.2 Estimation of the Recombination Frequency for Incompletely Informative Markers 305
7.2.1 Theoretical Frequencies of Identifiable Genotypes Between the Complete Marker and Other Three Categories of Markers 306
7.2.2 Theoretical Frequencies of Identifiable Genotypes Between Two Markers Belonging to Category II, III, or IV 308
7.2.3 Theoretical Frequencies of Identifiable Genotypes Between Two Category IV Markers 311
7.2.4 Haploid Type Rebuilding at the Presence of All Categories of Markers 316
7.3 Linkage Analysis in Double Cross F1 Derived from Four Pure-Line Parents 319
7.3.1 Marker Categories and Estimation of Recombination Frequency in the Double Cross F1 Population 319
7.3.2 Equivalence Between the Double Cross F1 of Pure-Line Parents and Hybrid F1 of Heterozygous Parents 323
7.3.3 Genotypic Frequencies at Three Complete Markers 325
7.3.4 Imputation of Incomplete and Missing Marker Information 327
7.4 QTL Mapping in the Double Cross F1 Population Derived from Four Pure-Line Parents 331
7.4.1 One-QTL Genetic Model in Double Cross F1 Population 332
7.4.2 The Linear Regression Model of the Phenotype on Marker Type for Multiple QTLs 335
7.4.3 Inclusive Composite Interval Mapping (ICIM) in the Double Cross F1 Population 336
Exercises 338

CHAPTER 8 Genetic Analysis in Multi-Parental Pure-Line Progeny Populations 345
8.1 Linkage Analysis in Four-Parental Pure-Line Populations 346
8.1.1 Development Procedure and Marker Classification in Four-Parental Pure-Line Populations 346
8.1.2 Theoretical Frequencies of Genotypes and Estimation of Recombination Frequency at Two Complete Loci 348
8.1.3 Estimation of the Recombination Frequency Involving Incomplete Markers 355
8.1.4 Situations When Number of Inbred Parents Smaller Than Four 359
8.2 Linkage Analysis in Eight-Parental Pure-Line Populations 360
8.2.1 Development Procedure and Marker Classification in Eight-Parental Pure-Line Populations 360
8.2.2 Marker Classification and Genotypic Coding in Eight-Parental Pure-Line Populations 362
8.2.3 Theoretical Frequencies of Genotypes at Two Complete Loci 363
8.2.4 Estimation of the Recombination Frequency Between Any Two Categories of Markers 368
8.2.5 Situations When the Number of Inbred Parents Smaller Than Eight 370
8.3 QTL Mapping in Four-Parental Pure-Line Populations 370
8.3.1 Genetic Constitution at Three Complete Loci 371
8.3.2 Imputation of the Incomplete and Missing Marker Information 372
8.3.3 The Linear Regression Model of Phenotype on Marker Types 378
8.3.4 Inclusive Composite Interval Mapping (ICIM) in Four-Parental Pure-Line Populations 380
8.4 QTL Mapping in Eight-Parental Pure-Line Populations 383
8.4.1 Genetic Constitution at Three Complete Loci 383
8.4.2 The Linear Regression Model of Phenotype on Marker Types 389
8.4.3 Inclusive Composite Interval Mapping (ICIM) in Eight-Parental Pure-Line Populations 391
Exercises 394

CHAPTER 9 QTL Mapping in Other Genetic Populations 399
9.1 Selective Genotyping Analysis and Bulked Segregant Analysis 400
9.1.1 Statistical Principles of Selective Genotyping Analysis 400
9.1.2 Likelihood Ratio Test and LOD Score Statistics from Selective Genotyping Analysis 402
9.1.3 Bulked Segregant Analysis 403
9.1.4 Problems with Selective Genotyping Analysis and Bulked Segregant Analysis 404
9.2 QTL Mapping in Populations of Chromosomal Segment Substitution Lines 404
9.2.1 Characteristics of Chromosomal Segment Substitution Lines 404
9.2.2 Mapping Methods in Populations of Chromosomal Segment Substitution Lines 407
9.2.3 QTL Mapping for Grain Length in a CSSL Population in Rice 412
9.3 QTL Mapping in Genetic Populations of Multiple Parents Crossed with One Common Parent 414
9.3.1 Generalized Linear Regression and Model Selection 415
9.3.2 Parameter Estimation and Hypothesis Testing in JICIM 415
9.3.3 QTL Mapping for Flowering Time in an Arabidopsis NAM Population 417
9.4 Mendelization of Quantitative Trait Genes 419
9.4.1 Preliminary Mapping of One QTL on Grain Width of Rice in One RIL Population 420
9.4.2 Validation of the Grain Width QTL by Chromosomal Segment Substitution Lines 421
9.4.3 Mendelization of a Stable QTL on Grain Width 424
9.4.4 Fine Mapping and Functional Analysis of the Gene at a Stable Grain Width QTL 425
9.5 Association Mapping in Natural Populations 427
9.5.1 Linkage Disequilibrium is the Prerequisite of Gene Mapping 427
9.5.2 Linkage Disequilibrium in Random Mating Populations 429
9.5.3 Factors Influencing Linkage Disequilibrium 432
9.5.4 Comparison of Linkage and Association Approaches in Gene Mapping 435
Exercises 439

CHAPTER 10 More on the Frequently Asked Questions in QTL Mapping 443
10.1 Genetic Variance and Contribution to Phenotypic Variation of the Detected QTL 443
10.1.1 Genetic Variance and Phenotypic Contribution from One QTL 443
10.1.2 Genetic Variance and Phenotypic Contribution of Linked QTLs 445
10.1.3 Phenotypic Contribution and the QTL Detection Power 448
10.2 On the Use of Composite Traits in QTL Mapping 450
10.2.1 Composite Traits and Their Applications in Genetic Studies and Breeding 450
10.2.2 QTL Mapping on Component and Composite Traits in One Maize RIL Population 451
10.2.3 Genetic Effects and Genetic Variances on Composite Traits 455
10.2.4 Power Analysis in QTL Mapping on Composite Traits 461
10.2.5 Heritability of Composite Traits 465
10.3 Effects on QTL Detection by the Increase in Marker Density 470
10.3.1 Effects of Denser Markers on Independent QTLs 470
10.3.2 Effect of Denser Markers on Linked QTLs 471
10.4 Imputation of Missing Marker Types and Their Effects in QTL Mapping in Bi-Parental Populations 474
10.4.1 Imputation of Missing and Incomplete Marker Types 474
10.4.2 QTLs on Plant Height in an F2 Population in Rice 477
10.4.3 Effects of Missing Marker Types on QTL Detection 479
10.5 Effects of Segregation Distortion on Genetic Studies 481
10.5.1 Segregation Distortion Loci in One Rice F2 Population 481
10.5.2 Effects of Segregation Distortion on QTL Mapping in Populations with Three Genotypes at Each Locus 482
10.5.3 Genetic Distance That can be Affected by Segregation Distortion 486
10.5.4 Effects of Segregation Distortion on QTL Mapping in Populations with Two Genotypes at Each Locus 487
10.6 Non-Normality of the Phenotypic Distribution 488
10.6.1 Phenotypic Model and Distribution of Quantitative Traits 488
10.6.2 QTL Mapping on Phenotypic Traits of the Non-Normal Distributions 489
Exercises 492

References 495
Index 503
Appendix A: Journal Articles Making Up This Book 509
Appendix B: Dissertations of Post-Graduates Making Up This Book 513
Appendix C: Integrated Software Packages Making Up This Book 515