
Data Analysis in Research: Types & Methods


Content Index

  • Why analyze data in research?
  • Types of data in research
  • Finding patterns in the qualitative data
  • Methods used for data analysis in qualitative research
  • Preparing data for analysis
  • Methods used for data analysis in quantitative research
  • Considerations in research data analysis
  • What is data analysis in research?

What is data analysis in research?

Definition: According to LeCompte and Schensul, research data analysis is a process used by researchers to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large chunk of data into smaller fragments that make sense.

Three essential things occur during the data analysis process. The first is data organization. The second is data reduction through summarization and categorization, which helps identify and link patterns and themes in the data. The third is data analysis itself, which researchers perform in both top-down and bottom-up fashion.


Marshall and Rossman, on the other hand, describe data analysis as a messy, ambiguous, and time-consuming but creative and fascinating process through which a mass of collected data is brought to order, structure, and meaning.

We can say that data analysis and data interpretation together form a process that applies deductive and inductive logic to the research.

Researchers rely heavily on data, as they have a story to tell or research problems to solve. The analysis starts with a question, and data is nothing but the answer to that question. But what if there is no question to ask? It is still possible to explore data without a problem; we call this 'data mining', and it often reveals interesting patterns within the data that are worth exploring.

Regardless of the type of data researchers explore, their mission and their audience's vision guide them in finding the patterns that shape the story they want to tell. One of the essential things expected from researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Remember, sometimes data analysis tells the most unforeseen yet exciting stories that were not anticipated when the analysis began. Therefore, rely on the data you have at hand and enjoy the journey of exploratory research.


Types of data in research

Every kind of data describes something once a specific value is assigned to it. For analysis, these values need to be organized, processed, and presented in a given context to make them useful. Data can take different forms; here are the primary data types.

  • Qualitative data: When the data presented consists of words and descriptions, we call it qualitative data. Although you can observe this data, it is subjective and harder to analyze in research, especially for comparison. Example: anything describing taste, experience, texture, or an opinion is qualitative data. This type of data is usually collected through focus groups, personal qualitative interviews, qualitative observation, or open-ended questions in surveys.
  • Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data. This type of data can be categorized, grouped, measured, calculated, or ranked. Example: responses to questions about age, rank, cost, length, weight, scores, and so on all fall under this type of data. You can present such data in graphs and charts or apply statistical analysis methods to it. The OMS (Outcomes Measurement Systems) questionnaires in surveys are a significant source of numeric data.
  • Categorical data: This is data presented in groups; an item included in categorical data cannot belong to more than one group. Example: a survey respondent describing their living style, marital status, smoking habit, or drinking habit provides categorical data. A chi-square test is a standard method used to analyze this data, as sketched below.
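To make the chi-square idea concrete, here is a minimal sketch in Python using SciPy's `chi2_contingency`. The contingency table and the variables (marital status vs. smoking habit) are hypothetical illustrations, not data from any real survey.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows are marital status (married, single),
# columns are smoking habit (non-smoker, smoker)
observed = np.array([
    [45, 15],
    [30, 30],
])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.4f}, df = {dof}")
# A small p-value (e.g., < 0.05) suggests the two categorical variables
# are not independent of each other.
```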


Data analysis in qualitative research

Data analysis in qualitative research works a little differently from numerical data because qualitative data is made up of words, descriptions, images, objects, and sometimes symbols. Getting insight from such complex information is an involved process, which is why qualitative data is typically used for exploratory research and data analysis.

Finding patterns in the qualitative data

Although there are several ways to find patterns in textual information, a word-based method is the most relied-upon and widely used technique for research and data analysis. Notably, the data analysis process in qualitative research is largely manual: researchers usually read the available data and look for repetitive or commonly used words.

For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find "food" and "hunger" to be the most commonly used words and will highlight them for further analysis.


Keyword context is another widely used word-based technique. In this method, the researcher tries to understand a concept by analyzing the context in which participants use a particular keyword.

For example, researchers studying the concept of 'diabetes' amongst respondents might analyze the context of when and how each respondent has used or referred to the word 'diabetes.'

The scrutiny-based technique is another highly recommended text analysis method used to identify patterns in qualitative data. Compare and contrast is the most widely used method under this technique, differentiating how one piece of text is similar to or different from another.

For example, to assess the importance of a resident doctor in a company, the collected data is divided into people who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare and contrast is the best method for analyzing polls with single-answer question types.

Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.

Variable partitioning is another technique used to split variables so that researchers can find more coherent descriptions and explanations in enormous data sets.


Methods used for data analysis in qualitative research

There are several techniques for analyzing data in qualitative research; here are some commonly used methods:

  • Content Analysis: This is the most widely accepted and most frequently employed technique for data analysis in research methodology. It can be used to analyze documented information from text, images, and sometimes physical items. When and where to use this method depends on the research questions.
  • Narrative Analysis: This method is used to analyze content gathered from various sources, such as personal interviews, field observation, and surveys. Most of the time, the stories or opinions shared by people are examined to find answers to the research questions.
  • Discourse Analysis: Similar to narrative analysis, discourse analysis is used to analyze interactions with people. However, this particular method considers the social context within which the communication between researcher and respondent takes place. Discourse analysis also takes the participants' lifestyle and day-to-day environment into account when deriving any conclusion.
  • Grounded Theory: When you want to explain why a particular phenomenon happened, grounded theory is the best resort for analyzing qualitative data. Grounded theory is applied to study data about a host of similar cases occurring in different settings. When researchers use this method, they might alter their explanations or produce new ones until they arrive at a conclusion.


Data analysis in quantitative research

Preparing data for analysis

The first stage in research and data analysis is to prepare the data for analysis so that nominal data can be converted into something meaningful. Data preparation consists of the phases below.

Phase I: Data Validation

Data validation is done to determine whether the collected data sample meets the pre-set standards or is a biased sample. It is divided into four stages:

  • Fraud: To ensure an actual human being records each response to the survey or the questionnaire
  • Screening: To make sure each participant or respondent is selected or chosen in compliance with the research criteria
  • Procedure: To ensure ethical standards were maintained while collecting the data sample
  • Completeness: To ensure that the respondent answered all the questions in an online survey, or that the interviewer asked all the questions devised in the questionnaire.

Phase II: Data Editing

More often than not, an extensive research data sample comes loaded with errors. Respondents sometimes fill in some fields incorrectly or skip them accidentally. Data editing is a process wherein the researchers confirm that the provided data is free of such errors. They need to conduct the necessary consistency and outlier checks to edit the raw data and make it ready for analysis.

Phase III: Data Coding

Out of all three, this is the most critical phase of data preparation, associated with grouping and assigning values to the survey responses. If a survey is completed with a sample size of 1,000, the researcher might create age brackets to distinguish respondents by age. It then becomes easier to analyze small data buckets rather than deal with the massive data pile. A minimal sketch of this kind of coding appears below.
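Here is a sketch in Python with pandas, assuming a hypothetical column of respondent ages; the bracket edges and labels are illustrative, not a prescribed standard:

```python
import pandas as pd

# Hypothetical ages from a survey with a sample size of 1,000 (first few shown)
ages = pd.Series([23, 37, 41, 58, 19, 66, 30])

# Code raw ages into brackets so the analysis deals with a handful of
# groups instead of hundreds of distinct values
brackets = pd.cut(
    ages,
    bins=[0, 18, 30, 45, 60, 120],
    labels=["under 18", "18-30", "31-45", "46-60", "over 60"],
)
print(brackets.value_counts().sort_index())
```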


Methods used for data analysis in quantitative research

After the data is prepared for analysis, researchers are open to using different research and data analysis methods to derive meaningful insights. Statistical analysis is certainly the most favored approach for numerical data. In statistical analysis, distinguishing between categorical data and numerical data is essential, as categorical data involves distinct categories or labels, while numerical data consists of measurable quantities. Statistical methods fall into two groups: descriptive statistics, used to describe the data, and inferential statistics, which help in comparing the data and generalizing beyond the sample.

Descriptive statistics

This method is used to describe the basic features of the many types of data encountered in research. It presents the data in such a meaningful way that patterns in the data start making sense. Nevertheless, descriptive analysis does not go beyond the data at hand: any conclusions drawn are still limited to the hypotheses researchers have formulated so far. Here are a few major types of descriptive analysis methods.

Measures of Frequency

  • Count, Percent, Frequency
  • It is used to denote how often a particular event occurs.
  • Researchers use it when they want to showcase how often a response is given.

Measures of Central Tendency

  • Mean, Median, Mode
  • These measures are widely used to demonstrate the central points of a distribution.
  • Researchers use this method when they want to showcase the most common or average response.

Measures of Dispersion or Variation

  • Range, Variance, Standard deviation
  • The range is the difference between the highest and lowest scores.
  • Variance and standard deviation summarize how far observed scores fall from the mean.
  • It is used to identify the spread of scores by stating intervals.
  • Researchers use this method to showcase how spread out the data is and how strongly that spread affects the mean.

Measures of Position

  • Percentile ranks, Quartile ranks
  • It relies on standardized scores helping researchers to identify the relationship between different scores.
  • It is often used when researchers want to compare scores with the average count.
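The four families of measures above can be computed in a few lines. Below is a minimal sketch in Python with NumPy and SciPy on a made-up set of scores (the `keepdims` argument to `stats.mode` assumes SciPy 1.9 or newer):

```python
import numpy as np
from scipy import stats

scores = np.array([62, 66, 68, 70, 75, 75, 75, 81, 85, 90])  # made-up data

# Measures of frequency: how often each score occurs
values, counts = np.unique(scores, return_counts=True)

# Measures of central tendency
mean, median = scores.mean(), np.median(scores)
mode = stats.mode(scores, keepdims=False).mode

# Measures of dispersion (ddof=1 gives the sample variance / std. deviation)
value_range = scores.max() - scores.min()
variance, std_dev = scores.var(ddof=1), scores.std(ddof=1)

# Measures of position: percentile rank of a score of 75
pct_rank = stats.percentileofscore(scores, 75)

print(mean, median, mode, value_range, variance, std_dev, pct_rank)
```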

For quantitative research, descriptive analysis often gives absolute numbers, but those numbers alone are never sufficient to demonstrate the rationale behind them. Nevertheless, it is necessary to think about which method of research and data analysis best suits your survey questionnaire and the story researchers want to tell. For example, the mean is the best way to demonstrate students' average scores in schools. It is better to rely on descriptive statistics when the researchers intend to keep the research or outcome limited to the provided sample without generalizing it: for example, when you want to compare the average votes cast in two different cities, descriptive statistics are enough.

Descriptive analysis is also called a ‘univariate analysis’ since it is commonly used to analyze a single variable.

Inferential statistics

Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample representing that population. For example, you can ask some 100 audience members at a movie theater if they like the movie they are watching. Researchers then use inferential statistics on the collected sample to reason that about 80-90% of people like the movie.

Here are two significant areas of inferential statistics.

  • Estimating parameters: This takes statistics from the sample research data and uses them to say something about the population parameter.
  • Hypothesis tests: These use sampled research data to answer survey research questions. For example, researchers might want to know whether a newly launched shade of lipstick is good or not, or whether multivitamin capsules help children perform better at games. A minimal sketch of a simple inference of this kind follows.
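As a sketch of the movie-theater example above: with a sample of 100 viewers, a normal-approximation confidence interval for the share of the whole audience that likes the movie can be computed by hand. The count of 85 is hypothetical.

```python
import math

n, liked = 100, 85          # hypothetical sample: 85 of 100 viewers liked it
p_hat = liked / n           # sample proportion

# Normal-approximation 95% confidence interval for the population proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)
margin = 1.96 * se
print(f"Estimated share who like the movie: {p_hat:.0%} "
      f"(95% CI roughly {p_hat - margin:.0%} to {p_hat + margin:.0%})")
```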

Inferential methods are sophisticated analysis techniques used to showcase the relationship between different variables rather than describe a single variable. They are often used when researchers want something beyond absolute numbers to understand the relationships between variables.

Here are some of the commonly used methods for data analysis in research.

  • Correlation: When researchers are not conducting experimental or quasi-experimental research but are interested in understanding the relationship between two or more variables, they opt for correlational research methods.
  • Cross-tabulation: Also called contingency tables, cross-tabulation is used to analyze the relationship between multiple variables. Suppose the provided data has age and gender categories presented in rows and columns; a two-dimensional cross-tabulation shows the number of males and females in each age category, making for seamless data analysis.
  • Regression analysis: To understand the strength of the relationship between two variables, researchers rarely look beyond the primary and commonly used regression analysis method, which is also a type of predictive analysis. In this method, you have an essential factor called the dependent variable, along with one or more independent variables, and you work out the impact of the independent variables on the dependent variable. The values of both independent and dependent variables are assumed to have been ascertained in an error-free, random manner. A minimal regression sketch appears after this list.
  • Frequency tables: A frequency table records how often each value, or range of values, of a variable occurs in the data. It is a simple way to summarize the distribution of responses before applying further statistical tests.
  • Analysis of variance (ANOVA): This statistical procedure tests the degree to which two or more groups vary or differ in an experiment. A considerable degree of variation suggests that the research findings are significant. In many contexts, ANOVA testing and variance analysis are similar.
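A minimal sketch of a simple one-variable regression in Python using `scipy.stats.linregress`; the advertising-spend and sales figures are invented for illustration:

```python
import numpy as np
from scipy.stats import linregress

# Invented data: advertising spend (independent) and sales (dependent)
ad_spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # in $1,000s
sales = np.array([2.1, 3.9, 6.2, 7.8, 10.1])    # in $1,000s

result = linregress(ad_spend, sales)
print(f"sales = {result.slope:.2f} * ad_spend + {result.intercept:.2f}")
print(f"r-squared = {result.rvalue ** 2:.3f}, p-value = {result.pvalue:.4f}")
```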
Considerations in research data analysis

  • Researchers must have the necessary research skills to analyze and manipulate the data, and should be trained to demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
  • Research and data analytics projects usually differ by scientific discipline; therefore, getting statistical advice at the beginning of analysis helps in designing the survey questionnaire, selecting data collection methods, and choosing samples.


  • The primary aim of research data analysis is to derive unbiased insights. Any mistake in collecting the data, a biased mindset, a poorly chosen analysis method, or a poorly chosen audience sample will lead to a biased inference.
  • No amount of sophistication in research data analysis can rectify poorly defined objectives and outcome measurements. Whether the design is at fault or the intentions are unclear, a lack of clarity can mislead readers, so avoid that practice.
  • The motive behind data analysis in research is to present accurate and reliable data. As far as possible, avoid statistical errors, and find ways to deal with everyday challenges such as outliers, missing data, data alteration, data mining, and developing graphical representations.

The sheer amount of data generated daily is staggering, especially now that data analysis has taken center stage. In 2018 alone, the total data supply amounted to 2.8 trillion gigabytes. Hence, it is clear that enterprises willing to survive in the hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights, and adapt to new market needs.


QuestionPro is an online survey platform that empowers organizations in data analysis and research and provides them a medium to collect data by creating appealing surveys.



Effective Experiment Design and Data Analysis in Transportation Research (2012)

Chapter 3: Examples of Effective Experiment Design and Data Analysis in Transportation Research


About this Chapter

This chapter provides a wide variety of examples of research questions. The examples demonstrate varying levels of detail with regard to experiment designs and the statistical analyses required. The number and types of examples were selected after consulting with many practitioners. The attempt was made to provide a couple of detailed examples in each of several areas of transportation practice. For each type of problem or analysis, some comments also appear about research topics in other areas that might be addressed using the same approach. Questions that were briefly introduced in Chapter 2 are addressed in considerably more depth in the context of these examples.

All the examples are organized and presented using the outline below. Where applicable, references to the two-volume primer produced under NCHRP Project 20-45 have been provided to encourage the reader to obtain more detail about calculation techniques and more technical discussion of issues.

Basic Outline for Examples

The numbered outline below is the model for the structure of all of the examples that follow.

1. Research Question/Problem Statement: A simple statement of the research question is given. For example, in the maintenance category, does crack sealant A perform better than crack sealant B?
2. Identification and Description of Variables: The dependent and independent variables are identified and described. The latter includes an indication of whether, for example, the variables are discrete or continuous.
3. Data Collection: A hypothetical scenario is presented to describe how, where, and when data should be collected. As appropriate, reference is made to conventions or requirements for some types of data (e.g., if delay times at an intersection are being calculated before and after some treatment, the data collected need to be consistent with the requirements in the Highway Capacity Manual). Typical problems are addressed, such as sample size, the need for control groups, and so forth.
4. Specification of Analysis Technique and Data Analysis: The links between successfully framing the research question, fully describing the variables that need to be considered, and the specification of the appropriate analysis technique are highlighted in each example. References to NCHRP Project 20-45 are provided for additional detail. The appropriate types of statistical test(s) are described for the specific example.
5. Interpreting the Results: In each example, results that can be expected from the analysis are discussed in terms of what they mean from a statistical perspective (e.g., the t-test result from

a comparison of means indicates whether the mean values of two distributions can be considered to be equal with a specified degree of confidence) as well as an operational perspective (e.g., judging whether the difference is large enough to make an operational difference). In each example, the typical results and their limitations are discussed.
6. Conclusion and Discussion: This section recaps how the early steps in the process lead directly to the later ones. Comments are made regarding how changes in the early steps can affect not only the results of the analysis but also the appropriateness of the approach.
7. Applications in Other Areas of Transportation Research: Each example includes a short list of typical applications in other areas of transportation research for which the approach or analysis technique would be appropriate.

Techniques Covered in the Examples

The determination of what kinds of statistical techniques to include in the examples was made after consulting with a variety of professionals and examining responses to a survey of research-oriented practitioners. The examples are not exhaustive insofar as not every type of statistical analysis is covered. However, the attempt has been made to cover a representative sample of techniques that the practitioner is most likely to encounter in undertaking or supervising research-oriented projects. The following techniques are introduced in one or more examples:

  • Descriptive statistics
  • Fitting distributions/goodness of fit (used in one example)
  • Simple one- and two-sample comparison of means
  • Simple comparisons of multiple means using analysis of variance (ANOVA)
  • Factorial designs (also ANOVA)
  • Simple comparisons of means before and after some treatment
  • Complex before-and-after comparisons involving control groups
  • Trend analysis
  • Regression
  • Logit analysis (used in one example)
  • Survey design and analysis
  • Simulation
  • Non-parametric methods (used in one example)

Although the attempt has been made to make the examples as readable as possible, some technical terms may be unfamiliar to some readers. Detailed definitions for most applicable statistical terms are available in the glossary in NCHRP Project 20-45, Volume 2, Appendix A. Most definitions used here are consistent with those contained in NCHRP Project 20-45, which contains useful information for everyone from the beginning researcher to the most accomplished statistician.

Some variations appear in the notations used in the examples. For example, in statistical analysis an alternate hypothesis may be represented by Ha or by H1, and readers will find both notations used in this report. The examples were developed by several authors with differing backgrounds, and latitude was deliberately given to the authors to use the notations with which they are most familiar. The variations have been included purposefully to acquaint readers with the fact that the same concepts (e.g., something as simple as a mean value) may be noted in various ways by different authors or analysts.

Finally, the more widely used techniques, such as analysis of variance (ANOVA), are applied in more than one example.
Readers interested in ANOVA are encouraged to read all the ANOVA examples as each example presents different aspects of or perspectives on the approach, and computational techniques presented in one example may not be repeated in later examples (although a citation typically is provided).

Areas Covered in the Examples

Transportation research is very broad, encompassing many fields. Based on consultation with many research-oriented professionals and a survey of practitioners, key areas of research were identified. Although these areas have lots of overlap, explicit examples in the following areas are included:

  • Construction
  • Environment
  • Lab testing and instrumentation
  • Maintenance
  • Materials
  • Pavements
  • Public transportation
  • Structures/bridges
  • Traffic operations
  • Traffic safety
  • Transportation planning
  • Work zones

The 21 examples provided on the following pages begin with the most straightforward analytical approaches (i.e., descriptive statistics) and progress to more sophisticated approaches. Table 1 lists the examples along with the area of research and method of analysis for each example.

Example 1: Structures/Bridges; Descriptive Statistics

Area: Structures/bridges

Method of Analysis: Descriptive statistics (exploring and presenting data to describe existing conditions and develop a basis for further analysis)

1. Research Question/Problem Statement: An engineer for a state agency wants to determine the functional and structural condition of a select number of highway bridges located across the state. Data are obtained for 100 bridges scheduled for routine inspection. The data will be used to develop bridge rehabilitation and/or replacement programs. The objective of this analysis is to provide an overview of the bridge conditions, and to present various methods to display the data in a concise and meaningful manner.

Question/Issue: Use collected data to describe existing conditions and prepare for future analysis. In this case, bridge inspection data from the state are to be studied and summarized.

2. Identification and Description of Variables: Bridge inspection generally entails collection of numerous variables that include location information, traffic data, structural elements' type and condition, and functional characteristics. In this example, the variables are: bridge condition ratings of the deck, superstructure, and substructure; and overall condition of the bridge. Based on the severity of deterioration and the extent of spread through a bridge component, a condition rating is assigned on a discrete scale from 0 (failed) to 9 (excellent). These ratings (in addition to several other factors) are used in categorization of a bridge in one of three overall conditions: not deficient; structurally deficient; or functionally obsolete.

Table 1. Examples provided in this report.

  Example 1. Structures/bridges: Descriptive statistics (exploring and presenting data to describe existing conditions)
  Example 2. Public transport: Descriptive statistics (organizing and presenting data to describe a system or component)
  Example 3. Environment: Descriptive statistics (organizing and presenting data to explain current conditions)
  Example 4. Traffic operations: Goodness of fit (chi-square test; determining if observed/collected data fit a certain distribution)
  Example 5. Construction: Simple comparisons to specified values (t-test to compare the mean value of a small sample to a standard or other requirement)
  Example 6. Maintenance: Simple two-sample comparison (t-test for paired comparisons; comparing the mean values of two sets of matched data)
  Example 7. Materials: Simple two-sample comparisons (t-test for paired comparisons and the F-test for comparing variances)
  Example 8. Laboratory testing and/or instrumentation: Simple ANOVA (comparing the mean values of more than two samples using the F-test)
  Example 9. Materials: Simple ANOVA (comparing more than two mean values and the F-test for equality of means)
  Example 10. Pavements: Simple ANOVA (comparing the mean values of more than two samples using the F-test)
  Example 11. Pavements: Factorial design (an ANOVA approach exploring the effects of varying more than one independent variable)
  Example 12. Work zones: Simple before-and-after comparisons (exploring the effect of some treatment before it is applied versus after it is applied)
  Example 13. Traffic safety: Complex before-and-after comparisons using control groups (examining the effect of some treatment or application with consideration of other factors)
  Example 14. Work zones: Trend analysis (examining, describing, and modeling how something changes over time)
  Example 15. Structures/bridges: Trend analysis (examining a trend over time)
  Example 16. Transportation planning: Multiple regression analysis (developing and testing proposed linear models with more than one independent variable)
  Example 17. Traffic operations: Regression analysis (developing a model to predict the values that a dependent variable can take as a function of one or more independent variables)
  Example 18. Transportation planning: Logit and related analysis (developing predictive models when the dependent variable is dichotomous)
  Example 19. Public transit: Survey design and analysis (organizing survey data for statistical analysis)
  Example 20. Traffic operations: Simulation (using field data to simulate or model operations or outcomes)
  Example 21. Traffic safety: Non-parametric methods (methods to be used when data do not follow assumed or conventional distributions)

3. Data Collection: Data are collected at 100 scheduled locations by bridge inspectors. It is important to note that the bridge condition rating scale is based on subjective categories, and there may be inherent variability among inspectors in their assignment of ratings to bridge components. A sample of data is compiled to document the bridge condition rating of the three primary structural components and the overall condition by location and ownership (Table 2). Notice that the overall condition of a bridge is not necessarily based only on the condition rating of its components (e.g., they cannot just be added).

Table 2. Sample bridge inspection data.

  Bridge No.  Owner         Location  Deck  Superstructure  Substructure  Overall Condition
  1           State         Rural     8     8               8             ND*
  7           Local agency  Rural     6     6               6             FO*
  39          State         Urban     6     6               2             SD*
  69          State park    Rural     7     5               5             SD
  92          City          Urban     5     6               6             ND

  *ND = not deficient; FO = functionally obsolete; SD = structurally deficient.

4. Specification of Analysis Technique and Data Analysis: The two primary variables of interest are bridge condition rating and overall condition. The overall condition of the bridge is a categorical variable with three possible values: not deficient; structurally deficient; and functionally obsolete. The frequencies of these values in the given data set are calculated and displayed in the pie chart (Figure 1). A pie chart provides a visualization of the relative proportions of bridges falling into each category that is often easier to communicate to the reader than a table showing the same information.

Figure 1. Highway bridge conditions. (Pie chart: structurally deficient, 13%; functionally obsolete, 10%; neither SD/FO, 77%.)

Another way to look at the overall bridge condition variable is by cross-tabulation of the three condition categories with the two location categories (urban and rural), as shown in Table 3. A cross-tabulation provides the joint distribution of two (or more) variables such that each cell represents the frequency of occurrence of a specific combination of possible values. For example, as seen in Table 3, there are 10 structurally deficient bridges in rural areas, which represent 11.4% of all rural area bridges inspected. The numbers in the parentheses are column percentages and add up to 100%. Table 3 also shows that 88 of the bridges inspected were located in rural areas, whereas 12 were located in urban areas.

Table 3. Cross-tabulation of bridge condition by location.

                          Rural        Urban       Total
  Structurally deficient  10 (11.4%)   3 (25.0%)   13
  Functionally obsolete   6 (6.8%)     4 (33.3%)   10
  Not deficient           72 (81.8%)   5 (41.7%)   77
  Total                   88 (100%)    12 (100%)   100

The mean values of the bridge condition rating variable for deck, superstructure, and substructure are shown in Table 4. These have been calculated by taking the sum of all the values and then dividing by the total number of cases (100 in this example). Generally, a condition rating

of 4 or below indicates deficiency in a structural component. For the purpose of comparison, the mean bridge condition rating of the 13 structurally deficient bridges also is provided. Notice that while the rating scale for the bridge conditions is discrete with values ranging from 0 (failure) to 9 (excellent), the average bridge condition variable is continuous. Therefore, an average score of 6.47 would indicate overall condition of all bridges to be between 6 (satisfactory) and 7 (good). The combined bridge condition rating of deck, superstructure, and substructure is not defined; therefore calculating the mean of the three components' average rating would make no sense. Also, the average bridge condition rating of functionally obsolete bridges is not calculated because other functional characteristics also accounted for this designation.

Table 4. Bridge condition ratings.

  Rating Category                                                                       Mean Value
  Overall average bridge condition rating (deck)                                        6.20
  Overall average bridge condition rating (superstructure)                              6.47
  Overall average bridge condition rating (substructure)                                6.08
  Average bridge condition rating of structurally deficient bridges (deck)              4.92
  Average bridge condition rating of structurally deficient bridges (superstructure)    5.30
  Average bridge condition rating of structurally deficient bridges (substructure)      4.54

The distributions of the bridge condition ratings for deck, superstructure, and substructure are shown in Figure 2. Based on the cut-off point of 4, approximately 7% of all bridge decks, 2% of all superstructures, and 5% of all substructures are deficient.

Figure 2. Bridge condition ratings. (Bar chart: percentage of structures by condition rating, 0 through 9, for deck, superstructure, and substructure.)

5. Interpreting the Results: The results indicate that a majority of bridges (77%) are not structurally or functionally deficient. The inspections were carried out on bridges primarily located in rural areas (88 out of 100). The bridge condition variable may also be cross-tabulated with the ownership variable to determine distribution by jurisdiction. The average condition ratings for the three bridge components for all bridges lie between 6 (satisfactory, some minor problems) and 7 (good, no problems noted).

6. Conclusion and Discussion: This example illustrates how to summarize and present quantitative and qualitative data on bridge conditions. It is important to understand the measurement scale of variables in order to interpret the results correctly. Bridge inspection data collected over time may also be analyzed to determine trends in the condition of bridges in a given area. Trend analysis is addressed in Example 15 (structures).

7. Applications in Other Areas of Transportation Research: Descriptive statistics could be used to present data in other areas of transportation research, such as:
  • Transportation Planning—to assess the distribution of travel times between origin-destination pairs in an urban area. Overall averages could also be calculated.
  • Traffic Operations—to analyze the average delay per vehicle at a railroad crossing.

  • Traffic Operations/Safety—to examine the frequency of turning violations at driveways with various turning restrictions.
  • Work Zones, Environment—to assess the average energy consumption during various stages of construction.

Example 2: Public Transport; Descriptive Statistics

Area: Public transport

Method of Analysis: Descriptive statistics (organizing and presenting data to describe a system or component)

1. Research Question/Problem Statement: The manager of a transit agency would like to present information to the board of commissioners on changes in revenue that resulted from a change in the fare. The transit system provides three basic types of service: local bus routes, express bus routes, and demand-responsive bus service. There are 15 local bus routes, 10 express routes, and 1 demand-responsive system.

Question/Issue: Use data to describe some change over time. In this instance, data from 2008 and 2009 are used to describe the change in revenue on each route/part of a transit system when the fare structure was changed from variable (per mile) to fixed fares.

2. Identification and Description of Variables: Revenue data are available for each route on the local and express bus system and the demand-responsive system as a whole for the years 2008 and 2009.

3. Data Collection: Revenue data were collected on each route for both 2008 and 2009. The annual revenue for the demand-responsive system was also collected. These data are shown in Table 5.

4. Specification of Analysis Technique and Data Analysis: The objective of this analysis is to present the impact of changing the fare system in a series of graphs. The presentation is intended to show the impact on each component of the transit system as well as the impact on overall system revenue. The impact of the fare change on the overall revenue is best shown with a bar graph (Figure 3). The variation in the impact across system components can be illustrated in a similar graph (Figure 4). A pie chart also can be used to illustrate the relative impact on each system component (Figure 5).

Table 5. Revenue by route or type of service and year.

  Bus Route                 2008 Revenue  2009 Revenue
  Local Route 1             $350,500      $365,700
  Local Route 2             $263,000      $271,500
  Local Route 3             $450,800      $460,700
  Local Route 4             $294,300      $306,400
  Local Route 5             $173,900      $184,600
  Local Route 6             $367,800      $375,100
  Local Route 7             $415,800      $430,300
  Local Route 8             $145,600      $149,100
  Local Route 9             $248,200      $260,800
  Local Route 10            $310,400      $318,300
  Local Route 11            $444,300      $459,200
  Local Route 12            $208,400      $205,600
  Local Route 13            $407,600      $412,400
  Local Route 14            $161,500      $169,300
  Local Route 15            $325,100      $340,200
  Express Route 1           $85,400       $83,600
  Express Route 2           $110,300      $109,200
  Express Route 3           $65,800       $66,200
  Express Route 4           $125,300      $127,600
  Express Route 5           $90,800       $90,400
  Express Route 6           $125,800      $123,400
  Express Route 7           $87,200       $86,900
  Express Route 8           $68,300       $67,200
  Express Route 9           $110,100      $112,300
  Express Route 10          $73,200       $72,100
  Demand-Responsive System  $510,100      $521,300

Figure 3. Impact of fare change on overall revenue. (Bar graph: total system revenue of $6.02 million in 2008 and $6.17 million in 2009.)
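As an aside, the percent changes discussed below can be reproduced from Table 5 with a few lines of pandas. This is a sketch using only the first rows of the table; the column names are illustrative:

```python
import pandas as pd

# A few rows of Table 5 (the full table has 26 rows)
df = pd.DataFrame({
    "route": ["Local Route 1", "Local Route 2",
              "Express Route 1", "Demand-Responsive System"],
    "type": ["Local", "Local", "Express", "Demand-responsive"],
    "rev_2008": [350500, 263000, 85400, 510100],
    "rev_2009": [365700, 271500, 83600, 521300],
})

# Percent change per route, then summary statistics by service type
df["pct_change"] = (df["rev_2009"] - df["rev_2008"]) / df["rev_2008"] * 100
print(df.groupby("type")["pct_change"].agg(["mean", "min", "max"]))
```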

Figure 4. Variation in impact of fare change across system components. (Bar graph: local buses $4.57M in 2008 vs. $4.71M in 2009; express buses $0.94M in both years; demand responsive $0.51M vs. $0.52M.)

Figure 5. Pie charts illustrating percent of revenue from each component of a transit system. (2008: local buses 75.8%, express buses 15.7%, demand responsive 8.5%. 2009: local buses 76.3%, express buses 15.2%, demand responsive 8.5%.)

If it is important to display the variability in the impact within the various bus routes in the local bus or express bus operations, this also can be illustrated (Figure 6). This type of diagram shows the maximum value, minimum value, and mean value of the percent increase in revenue across the 15 local bus routes and the 10 express bus routes.

Figure 6. Graph showing variation in revenue increase by type of bus route. (Maximum, minimum, and mean percent increase in revenue for the local and express bus routes; means of 3.1% and -0.4%, respectively.)

5. Interpreting the Results: These results indicate that changing from a variable fare based on trip length (2008) to a fixed fare (2009) on both the local bus routes and the express bus routes had little effect on revenue. On the local bus routes, there was an average increase in revenue of 3.1%. On the express bus routes, there was an average decrease in revenue of 0.4%. These changes altered the percentage of the total system revenue attributed to the local bus routes and the express bus routes. The local bus routes generated 76.3% of the revenue in 2009, compared to 75.8% in 2008. The percentage of revenue generated by the express bus routes dropped from 15.7% to 15.2%, and the demand-responsive system generated 8.5% in both 2008 and 2009.

6. Conclusion and Discussion: The total revenue increased from $6.02 million to $6.17 million. The cost of operating a variable fare system is greater than that of operating a fixed fare system—hence, net income probably increased even more (more revenue, lower cost for fare collection), and the decision to modify the fare system seems reasonable. Notice that the entire discussion

also is based on the assumption that no other factors changed between 2008 and 2009 that might have affected total revenues. One of the implicit assumptions is that the number of riders remained relatively constant from 1 year to the next. If the ridership had changed, the statistics reported would have to be changed. Using the measure revenue/rider, for example, would help control (or normalize) for the variation in ridership.

7. Applications in Other Areas in Transportation Research: Descriptive statistics are widely used and can convey a great deal of information to a reader. They also can be used to present data in many areas of transportation research, including:
  • Transportation Planning—to display public response frequency or percentage to various alternative designs.
  • Traffic Operations—to display the frequency or percentage of crashes by route type or by the type of traffic control devices present at an intersection.
  • Airport Engineering—to display the arrival pattern of passengers or flights by hour or other time period.
  • Public Transit—to display the average load factor on buses by time of day.

Example 3: Environment; Descriptive Statistics

Area: Environment

Method of Analysis: Descriptive statistics (organizing and presenting data to explain current conditions)

1. Research Question/Problem Statement: The planning and programming director in Environmental City wants to determine the current ozone concentration in the city. These data will be compared to data collected after the projects included in the Transportation Improvement Program (TIP) have been completed to determine the effects of these projects on the environment. Because the terrain, the presence of hills or tall buildings, the prevailing wind direction, and the sample station location relative to high volume roads or industrial sites all affect the ozone level, multiple samples are required to determine the ozone concentration level in a city. For this example, air samples are obtained each weekday in the month of July (21 days) at 14 air-sampling stations in the city: 7 in the central city and 7 in the outlying areas of the city. The objective of the analysis is to determine the ozone concentration in the central city, the outlying areas of the city, and the city as a whole.

Question/Issue: Use collected data to describe existing conditions and prepare for future analysis. In this example, air pollution levels in the central city, the outlying areas, and the overall city are to be described.

2. Identification and Description of Variables: The variable to be analyzed is the 8-hour average ozone concentration in parts per million (ppm) at each of the 14 air-sampling stations. The 8-hour average concentration is the basis for the EPA standard, and July is selected because ozone levels are temperature sensitive and increase with a rise in the temperature.

3. Data Collection: Ozone concentrations in ppm are recorded for each hour of the day at each of the 14 air-sampling stations. The highest average concentration for any 8-hour period during the day is recorded and tabulated. This results in 294 concentration observations (14 stations for 21 days). Table 6 and Table 7 show the data for the seven central city locations and the seven outlying area locations.

Table 6. Central city 8-hour ozone concentration samples (ppm).

  Day   St.1   St.2   St.3   St.4   St.5   St.6   St.7   Sum
  1     0.079  0.084  0.081  0.083  0.088  0.086  0.089  0.590
  2     0.082  0.087  0.088  0.086  0.086  0.087  0.081  0.597
  3     0.080  0.081  0.077  0.072  0.084  0.083  0.081  0.558
  4     0.083  0.086  0.082  0.079  0.086  0.087  0.089  0.592
  5     0.082  0.087  0.080  0.075  0.090  0.089  0.085  0.588
  6     0.075  0.084  0.079  0.076  0.080  0.083  0.081  0.558
  7     0.078  0.079  0.080  0.074  0.078  0.080  0.075  0.544
  8     0.081  0.077  0.082  0.081  0.076  0.079  0.074  0.540
  9     0.088  0.084  0.083  0.085  0.083  0.083  0.088  0.594
  10    0.085  0.087  0.086  0.089  0.088  0.087  0.090  0.612
  11    0.079  0.082  0.082  0.089  0.091  0.089  0.090  0.602
  12    0.078  0.080  0.081  0.086  0.088  0.089  0.089  0.591
  13    0.081  0.079  0.077  0.083  0.084  0.085  0.087  0.576
  14    0.083  0.080  0.079  0.081  0.080  0.082  0.083  0.568
  15    0.084  0.083  0.080  0.085  0.082  0.086  0.085  0.585
  16    0.086  0.087  0.085  0.087  0.089  0.090  0.089  0.613
  17    0.082  0.085  0.083  0.090  0.087  0.088  0.089  0.604
  18    0.080  0.081  0.080  0.087  0.085  0.086  0.088  0.587
  19    0.080  0.083  0.077  0.083  0.085  0.084  0.087  0.579
  20    0.081  0.084  0.079  0.082  0.081  0.083  0.088  0.578
  21    0.082  0.084  0.080  0.081  0.082  0.083  0.085  0.577
  Sum   1.709  1.744  1.701  1.734  1.773  1.789  1.793  12.243

4. Specification of Analysis Technique and Data Analysis: Much of the data used in analyzing transportation issues has year-to-year, month-to-month, day-to-day, and even hour-to-hour variations. For this reason, making only one observation, or even a few observations, may not accurately describe the phenomenon being observed. Thus, standard practice is to obtain several observations and report the mean value of all observations. In this example, the phenomenon being observed is the daily ozone concentration at a series of air-sampling locations. The statistic to be estimated is the mean value of this variable over

the test period selected. The mean value of any data set ($\bar{x}$) equals the sum of all observations in the set divided by the total number of observations in the set (n):

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The variables of interest stated in the research question are the average ozone concentration for the central city, the outlying areas, and the total city. Thus, there are three data sets: the first table, the second table, and the sum of the two tables. The first data set has a sample size of 147; the second data set also has a sample size of 147, and the third data set contains 294 observations. Using the formula just shown, the mean value of the ozone concentration in the central city is calculated as follows:

$$\bar{x} = \frac{\sum_{i=1}^{147} x_i}{147} = \frac{12.243}{147} = 0.083 \text{ ppm}$$

The mean value of the ozone concentration in the outlying areas of the city is:

$$\bar{x} = \frac{\sum_{i=1}^{147} x_i}{147} = \frac{10.553}{147} = 0.072 \text{ ppm}$$

The mean value of the ozone concentration for the entire city is:

$$\bar{x} = \frac{\sum_{i=1}^{294} x_i}{294} = \frac{22.796}{294} = 0.078 \text{ ppm}$$

Table 7. Outlying area 8-hour ozone concentration samples (ppm).

  Day   St.8   St.9   St.10  St.11  St.12  St.13  St.14  Sum
  1     0.072  0.074  0.073  0.071  0.079  0.070  0.074  0.513
  2     0.074  0.075  0.077  0.075  0.081  0.075  0.077  0.534
  3     0.070  0.072  0.074  0.074  0.083  0.078  0.080  0.531
  4     0.067  0.070  0.071  0.077  0.080  0.077  0.081  0.523
  5     0.064  0.067  0.068  0.072  0.079  0.078  0.079  0.507
  6     0.069  0.068  0.066  0.070  0.075  0.079  0.082  0.509
  7     0.071  0.069  0.070  0.071  0.074  0.071  0.077  0.503
  8     0.073  0.072  0.074  0.072  0.076  0.073  0.078  0.518
  9     0.072  0.075  0.077  0.074  0.078  0.074  0.080  0.530
  10    0.074  0.077  0.079  0.077  0.080  0.076  0.079  0.542
  11    0.070  0.072  0.075  0.074  0.079  0.074  0.078  0.522
  12    0.068  0.067  0.068  0.070  0.074  0.070  0.075  0.492
  13    0.065  0.063  0.067  0.068  0.072  0.067  0.071  0.473
  14    0.063  0.062  0.067  0.069  0.073  0.068  0.073  0.475
  15    0.064  0.064  0.066  0.067  0.070  0.066  0.070  0.467
  16    0.061  0.059  0.062  0.062  0.067  0.064  0.069  0.434
  17    0.065  0.061  0.060  0.064  0.069  0.066  0.073  0.458
  18    0.067  0.063  0.065  0.068  0.073  0.069  0.076  0.499
  19    0.069  0.067  0.068  0.072  0.077  0.071  0.078  0.502
  20    0.071  0.069  0.070  0.074  0.080  0.074  0.077  0.515
  21    0.070  0.065  0.072  0.076  0.079  0.073  0.079  0.514
  Sum   1.439  1.431  1.409  1.497  1.598  1.513  1.606  10.553
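The same means can be checked programmatically. Below is a minimal sketch in Python with NumPy, assuming the 21 x 14 grid of readings from Tables 6 and 7 has been saved to a hypothetical CSV file:

```python
import numpy as np

# ozone[d, s] = 8-hour reading (ppm) for day d at station s;
# columns 0-6 are central city stations 1-7, columns 7-13 are stations 8-14
ozone = np.loadtxt("ozone_readings.csv", delimiter=",")  # hypothetical file

central, outlying = ozone[:, :7], ozone[:, 7:]
print(f"Central city mean:  {central.mean():.3f} ppm")   # ~0.083
print(f"Outlying area mean: {outlying.mean():.3f} ppm")  # ~0.072
print(f"Citywide mean:      {ozone.mean():.3f} ppm")     # ~0.078

station_means = ozone.mean(axis=0)  # per-station means, e.g., ~0.081 for Station 1
day_means = ozone.mean(axis=1)      # per-day citywide means
```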

Using the same equation, the mean value for each air-sampling location can be found by summing the value of the ozone concentration in the column representing that location and dividing by the 21 observations at that location. For example, considering Sample Station 1, the mean value of the ozone concentration is 1.709/21 = 0.081 ppm. Similarly, the mean value of the ozone concentrations for any specific day can be found by summing the ozone concentration values in the row representing that day and dividing by the number of stations. For example, for Day 1, the mean value of the ozone concentration in the central city is 0.590/7 = 0.084. In the outlying areas of the city, it is 0.513/7 = 0.073, and for the entire city it is 1.103/14 = 0.079.

The highest and lowest values of the ozone concentration can be obtained by searching the two tables. The highest ozone concentration (0.091 ppm) is logged as having occurred at Station 5 on Day 11. The lowest ozone concentration (0.059 ppm) occurred at Station 9 on Day 16.

The variation by sample location can be illustrated in the form of a frequency diagram. A graph can be used to show the variation in the average ozone concentration for the seven sample stations in the central city (Figure 7).

Figure 7. Average ozone concentration for seven central city sampling stations (ppm). (Bar chart of station means: 0.081, 0.083, 0.081, 0.083, 0.084, 0.085, 0.085 for stations 1 through 7.)

Notice that all of these calculations (and more) can be done very easily if all the data are put in a spreadsheet and various statistical functions used. Graphs and other displays also can be made within the spreadsheet.

5. Interpreting the Results: In this example, the data are not tested to determine whether they fit a known distribution or whether one average value is significantly higher or lower than another. It can only be reported that, as recorded in July, the mean ozone concentration in the central city was greater than the concentration in the outlying areas of the city. (For testing to see whether the data fit a known distribution, see Example 4 on fitting distributions and goodness of fit. For comparing mean values, see examples 5 through 7.)

It is known that ozone concentration varies by day and by location of the air-sampling equipment. If there is some threshold value of importance, such as the ozone concentration level considered acceptable by the EPA, these data could be used to determine the number of days that this level was exceeded, or the number of stations that recorded an ozone concentration above this threshold. This is done by comparing each day or each station with the threshold

value. It must be noted that, as presented, this example is not a statistical comparison per se (i.e., there has been no significance testing or formal statistical comparison).

6. Conclusion and Discussion: This example illustrates how to determine and present quantitative information about a data set containing values of a varying parameter. If a similar set of data were captured each month, the variation in ozone concentration could be analyzed to describe the variation over the year. Similarly, if data were captured at these same locations in July of every year, the trend in ozone concentration over time could be determined.

7. Applications in Other Areas in Transportation: These descriptive statistics techniques can be used to present data in other areas of transportation research, such as:
  • Traffic Operations/Safety and Transportation Planning
    – to analyze the average speed of vehicles on streets with a speed limit of 45 miles per hour (mph) in residential, commercial, and industrial areas by sampling a number of streets in each of these area types.
    – to examine the average emergency vehicle response time to various areas of the city or county, by analyzing dispatch and arrival times for emergency calls to each area of interest.
  • Pavement Engineering—to analyze the average number of potholes per mile on pavement as a function of the age of pavement, by sampling a number of streets where the pavement age falls in discrete categories (0 to 5 years, 5 to 10 years, 10 to 15 years, and greater than 15 years).
  • Traffic Safety—to evaluate the average number of crashes per month at intersections with two-way STOP control versus four-way STOP control by sampling a number of intersections in each category over time.

Example 4: Traffic Operations; Goodness of Fit

Area: Traffic operations

Method of Analysis: Goodness of fit (chi-square test; determining if observed distributions of data fit hypothesized standard distributions)

1. Research Question/Problem Statement: A research team is developing a model to estimate travel times of various types of personal travel (modes) on a path shared by bicyclists, in-line skaters, and others. One version of the model relies on the assertion that the distribution of speeds for each mode conforms to the normal distribution. (For a helpful definition of this and other statistical terms, see the glossary in NCHRP Project 20-45, Volume 2, Appendix A.) Based on a literature review, the researchers are sure that bicycle speeds are normally distributed. However, the shapes of the speed distributions for other users are unknown. Thus, the objective is to determine if skater speeds are normally distributed in this instance.

Question/Issue: Do collected data fit a specific type of probability distribution? In this example, do the speeds of in-line skaters on a shared-use path follow a normal distribution (are they normally distributed)?

2. Identification and Description of Variables: The only variable collected is the speed of in-line skaters passing through short sections of the shared-use path.

3. Data Collection: The team collects speeds using a video camera placed where most path users would not notice it. The speed of each free-flowing skater (i.e., each skater who is not closely following another path user) is calculated from the times that the skater passes two benchmarks on the path visible in the camera frame.
Several days of data collection allow a large sample of 219 skaters to be measured. (An implicit assumption is made that there is no variation in the data by day.)

The data have a familiar bell shape; that is, when graphed, they look like they are normally distributed (Figure 8). Each bar in the figure shows the number of observations per 1.00-mph-wide speed bin; for example, there are 10 observations between 6.00 mph and 6.99 mph.

[Figure 8. Distribution of observed in-line skater speeds (number of observations per 1-mph speed bin).]

4. Specification of Analysis Technique and Data Analysis: This analysis involves several preliminary steps followed by two major steps. In the preliminaries, the team calculates the mean and standard deviation from the data sample as 10.17 mph and 2.79 mph, respectively, using standard formulas described in NCHRP Project 20-45, Volume 2, Chapter 6, Section C under the heading "Frequency Distributions, Variance, Standard Deviation, Histograms, and Boxplots." Then the team forms bins of observations of sufficient size to conduct the analysis. For this analysis, the team forms bins containing at least four observations each, which means forming a bin for speeds of 5 mph and lower and a bin for speeds of 17 mph or higher. There is some argument regarding the minimum allowable cell size. Some analysts argue that the minimum is five; others argue that the cell size can be smaller. Smaller numbers of observations in a bin may distort the results. When in doubt, the analysis can be done with different assumptions regarding the cell size. The left two columns in Table 8 show the data ready for analysis.

The first major step of the analysis is to generate the theoretical normal distribution to compare to the field data. To do this, the team calculates a value of Z, the standard normal variable, for each bin i using the following equation:

$Z = \frac{x_i - \mu}{\sigma}$

where $x_i$ is the speed in miles per hour (mph) corresponding to bin i, $\mu$ is the mean speed, and $\sigma$ is the standard deviation of all of the observations in the speed sample in mph. For example (and with reference to the data in Table 8), for a speed of 5 mph the value of Z is (5 − 10.17)/2.79 = −1.85, and for a speed of 6 mph the value of Z is (6 − 10.17)/2.79 = −1.50. The team then consults a table of standard normal values (i.e., NCHRP Project 20-45, Volume 2, Appendix C, Table C-1) to convert these Z values into A values representing the area under the standard normal distribution curve. The A value for a Z of −1.85 is 0.468, while the A value for a Z of −1.50 is 0.432. The difference between these two A values, representing the area under the standard normal probability curve corresponding to the speed of 6 mph, is 0.036 (calculated 0.468 − 0.432 = 0.036). The team multiplies 0.036 by the total sample size (219) to estimate that there should be 7.78 skaters with a speed of 6 mph if the speeds follow the standard normal distribution. The team follows

a similar procedure for all speeds. Notice that the areas under the curve also can be calculated in a simple Excel spreadsheet using the "NORMDIST" function for a given x value, the average speed of 10.17, and the standard deviation of 2.79. The values shown in Table 8 have been estimated using the Excel function.

The second major step of the analysis is to use the chi-square test (as described in NCHRP Project 20-45, Volume 2, Chapter 6, Section F) to determine if the theoretical normal distribution is significantly different from the actual data distribution. The team computes a chi-square value for each bin i using the formula:

$\chi_i^2 = \frac{(O_i - E_i)^2}{E_i}$

where $O_i$ is the number of actual observations in bin i and $E_i$ is the expected number of observations in bin i estimated by using the theoretical distribution. For the bin of 6 mph speeds, O = 10 (from the table), E = 7.78 (calculated), and the $\chi_i^2$ contribution for that cell is 0.637. The sum of the $\chi_i^2$ values for all bins is 19.519.

The degrees of freedom (df) used for this application of the chi-square test are the number of bins minus 1 minus the number of parameters in the distribution of interest. Given that the normal distribution has two parameters (see May, Traffic Flow Fundamentals, 1990, p. 40), in this example the degrees of freedom equal 9 (calculated 12 − 1 − 2 = 9). From a standard table of chi-square values (NCHRP Project 20-45, Volume 2, Appendix C, Table C-2), the team finds that the critical value at the 95% confidence level for this case (with df = 9) is 16.9. The calculated value of the statistic is ~19.5, more than the tabular value. The results of all of these observations and calculations are shown in Table 8.

Table 8. Observations, theoretical predictions, and chi-square values for each bin.

Speed (mph)       Number of      Number Predicted by    Chi-Square
                  Observations   Normal Distribution    Value
Under 5.99          6              6.98                  0.137
6.00 to 6.99       10              7.78                  0.637
7.00 to 7.99       18             13.21                  1.734
8.00 to 8.99       24             19.78                  0.902
9.00 to 9.99       37             26.07                  4.585
10.00 to 10.99     38             30.26                  1.980
11.00 to 11.99     24             30.93                  1.554
12.00 to 12.99     21             27.85                  1.685
13.00 to 13.99     15             22.08                  2.271
14.00 to 14.99     13             15.42                  0.379
15.00 to 15.99      4              9.48                  3.169
16.00 to 16.99      4              5.13                  0.251
17.00 and over      5              4.03                  0.234
Total             219            219                    19.519

5. Interpreting the Results: The calculated chi-square value of ~19.5 is greater than the critical chi-square value of 16.9. The team concludes, therefore, that the normal distribution is significantly different from the distribution of the speed sample at the 95% level (i.e., that the in-line skater speed data do not appear to be normally distributed). Larger variations between the observed and expected distributions lead to higher values of the statistic and would be interpreted as it being less likely that the data are distributed according to the hypothesized distribution. Conversely, smaller variations between observed and expected distributions result in lower values of the statistic, which would suggest that it is more likely that the data are normally distributed because the observed values would fit better with the expected values.
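For readers who prefer code to a spreadsheet, a minimal sketch of the same goodness-of-fit computation follows, in Python with SciPy (an assumption; the report uses Excel's NORMDIST). The observed counts are taken from Table 8. Note that computing the degrees of freedom directly from the 13 bins gives df = 10 rather than the 9 used in the worked example above; the conclusion (reject normality) is the same either way.

```python
import numpy as np
from scipy import stats

# Observed counts per speed bin from Table 8 (first bin <= 5.99 mph,
# last bin >= 17.00 mph).
observed = np.array([6, 10, 18, 24, 37, 38, 24, 21, 15, 13, 4, 4, 5])
edges = np.arange(6, 18)            # interior bin edges at 6, 7, ..., 17 mph
mu, sigma, n = 10.17, 2.79, 219

# Expected counts: normal-curve area of each bin times the sample size,
# with the two end bins absorbing the tails.
cdf = stats.norm.cdf(edges, mu, sigma)
probs = np.diff(np.concatenate(([0.0], cdf, [1.0])))
expected = probs * n

chi_sq = ((observed - expected) ** 2 / expected).sum()
df = len(observed) - 1 - 2          # 2 estimated parameters (mu, sigma)
crit = stats.chi2.ppf(0.95, df)
print(f"chi-square = {chi_sq:.2f}, critical value (df={df}) = {crit:.2f}")
```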

6. Conclusion and Discussion: In this case, the results suggest that the normal distribution is not a good fit to free-flow speeds of in-line skaters on shared-use paths. Interestingly, if the 23 mph observation is considered to be an outlier and discarded, the analysis yields a different conclusion (that the data are normally distributed). Some researchers use a simple rule that an outlier exists if the observation is more than three standard deviations from the mean value. (In this example, the 23 mph observation is, indeed, more than three standard deviations from the mean.) If there is concern with discarding the observation as an outlier, it would be easy enough in this example to repeat the data collection exercise.

Looking at the plotted data, it is reasonably apparent that the well-known normal distribution should be a good fit (at least without the value of 23). However, the results from the statistical test could not confirm the suspicion. In other cases, the type of distribution may not be so obvious, the distributions in question may be obscure, or some distribution parameters may need to be calibrated for a good fit. In these cases, the statistical test is much more valuable.

The chi-square test also can be used simply to compare two observed distributions to see if they are the same, independent of any underlying probability distribution. For example, if it is desired to know whether the distribution of traffic volume by vehicle type (e.g., automobiles, light trucks, and so on) is the same at two different freeway locations, the two distributions can be compared to see if they are similar.

The consequences of an error in the procedure outlined here can be severe. This is because the distributions chosen as a result of the procedure often become the heart of predictive models used by many other engineers and planners. A poorly chosen distribution will often provide erroneous predictions for many years to come.

7. Applications in Other Areas of Transportation Research: Fitting distributions to data samples is important in several areas of transportation research, such as:
• Traffic Operations—to analyze shapes of vehicle headway distributions, which are of great interest, especially as a precursor to calibrating and using simulation models.
• Traffic Safety—to analyze collision frequency data. Analysts often assume that the Poisson distribution is a good fit for collision frequency data and must use the method described here to validate the claim.
• Pavement Engineering—to form models of pavement wear or otherwise compare results obtained using different designs, as it is often required to check the distributions of the parameters used (e.g., roughness).

Example 5: Construction; Simple Comparisons to Specified Values

Area: Construction
Method of Analysis: Simple comparisons to specified values—using Student's t-test to compare the mean value of a small sample to a standard or other requirement (i.e., to a population with a known mean and unknown standard deviation or variance)
1. Research Question/Problem Statement: A contractor wants to determine if a specified soil compaction can be achieved on a segment of the road under construction by using an on-site roller or if a new roller must be brought in.

The cost of obtaining samples for many construction materials and practices is quite high. As a result, decisions often must be made based on a small number of samples. The appropriate statistical technique for comparing the mean value of a small sample with a standard or requirement is Student's t-test. Formally, the working, or null, hypothesis (Ho) and the alternative hypothesis (Ha) can be stated as follows:

Ho: The soil compaction achieved using the on-site roller (CA) is less than or equal to the specified value (CS); that is, (CA ≤ CS).
Ha: The soil compaction achieved using the on-site roller (CA) is greater than the specified value (CS); that is, (CA > CS).

Question/Issue: Determine whether a sample mean exceeds a specified value. Alternatively, determine the probability of obtaining a sample mean (x̄) from a sample of size n if the universe being sampled has a true mean less than or equal to a population mean with an unknown variance. In this example, is an observed mean of soil compaction samples equal to or greater than a specified value?

2. Identification and Description of Variables: The variable to be used is the soil density results of nuclear densometer tests. These values will be used to determine whether the use of the on-site roller is adequate to meet the contract-specified soil density, obtained in the laboratory (Proctor density), of 95%.

3. Data Collection: A 125-foot section of road is constructed and compacted with the on-site roller, and four samples of the soil density are obtained (25 feet, 50 feet, 75 feet, and 100 feet from the beginning of the test section).

4. Specification of Analysis Technique and Data Analysis: For small samples (n < 30) where the population mean is known but the population standard deviation is unknown, it is not appropriate to describe the distribution of the sample mean with a normal distribution. The appropriate distribution is called Student's distribution (t-distribution or t-statistic). The equation for Student's t-statistic is:

$t = \frac{\bar{x} - \bar{x}'}{S / \sqrt{n}}$

where $\bar{x}$ is the sample mean, $\bar{x}'$ is the population mean (or specified standard), S is the sample standard deviation, and n is the sample size.

The four nuclear densometer readings were 98%, 97%, 93%, and 99%. Showing some simple sample calculations:

$\bar{X} = \frac{\sum_{i=1}^{4} X_i}{4} = \frac{98 + 97 + 93 + 99}{4} = \frac{387}{4} = 96.75\%$

$S = \sqrt{\frac{\sum_i (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{20.74}{3}} = 2.63\%$

and using the equation for t above:

$t = \frac{96.75 - 95.00}{2.63 / \sqrt{4}} = \frac{1.75}{1.32} = 1.33$

The calculated value of the t-statistic (1.33) is most typically compared to the tabulated values of the t-statistic (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-4) for a given significance level (typically called t critical or tcrit). For a sample size of n = 4, having n − 1 = 3 degrees of freedom (df), the values for tcrit are 1.638 for α = 0.10 and 2.353 for α = 0.05 (two common values of α for testing, the latter being most common).

Important: The specification of the significance level (α level) for testing should be done before actual testing and interpretation of results are done. In many instances, the appropriate level is defined by the agency doing the testing, a specified testing standard, or simply common practice. Generally speaking, selection of a smaller value for α (e.g., α = 0.05 versus α = 0.10) sets a more stringent standard.

In this example, because the calculated value of t (1.33) is less than the critical value (2.353, given α = 0.05), the null hypothesis is accepted. That is, the engineer cannot be confident that the mean value from the densometer tests (96.75%) is greater than the required specification (95%). If a lower confidence level is chosen (e.g., α = 0.15), the value for tcrit would change to 1.250, which means the null hypothesis would be rejected. A lower confidence level can have serious implications. For example, there is an approximately 15% chance that the standard will not be met. That level of risk may or may not be acceptable to the contractor or the agency. Notice that in many standards the required significance level is stated (typically α = 0.05). It should be emphasized that the confidence level should be chosen before calculations and testing are done. It is not generally permissible to change the confidence level after calculations have been performed. Doing this would be akin to arguing that standards can be relaxed if a test gives an answer that the analyst doesn't like.

The results of small-sample tests often are sensitive to the number of samples that can be obtained at a reasonable cost. (The mean value may change considerably as more data are added.) In this example, if it were possible to obtain nine independent samples (as opposed to four) and the mean value and sample standard deviation were the same as with the four samples, the calculation of the t-statistic would be:

$t = \frac{96.75 - 95.00}{2.63 / \sqrt{9}} = 1.99$

Comparing this value of t (with the larger sample size) to the appropriate tcrit (for n − 1 = 8 df and α = 0.05) of 1.860 changes the outcome. That is, the calculated value of the t-statistic is now larger than the tabulated value of tcrit, and the null hypothesis is rejected. Thus, it is accepted that the mean of the densometer readings meets or exceeds the standard. It should be noted, however, that the inclusion of additional tests may yield a different mean value and standard deviation, in which case the results could be different.

5. Interpreting the Results: By themselves, the results of the statistical analysis are insufficient to answer the question as to whether a new roller should be brought to the project site. These results only provide information the contractor can use to make this decision. The ultimate decision should be based on these probabilities and knowledge of the cost of each option.
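As an aside, the one-sided small-sample test above can be reproduced in a few lines of Python with SciPy (an assumption on tooling; `ttest_1samp`'s `alternative` argument requires a reasonably recent SciPy release):

```python
import numpy as np
from scipy import stats

densities = np.array([98, 97, 93, 99])   # nuclear densometer readings, %
spec = 95.0                               # specified Proctor density, %

# One-sided test: can the mean compaction be said to exceed the spec?
t_stat, p_value = stats.ttest_1samp(densities, spec, alternative='greater')
t_crit = stats.t.ppf(1 - 0.05, df=len(densities) - 1)
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.3f}, t_crit = {t_crit:.3f}")
# t = 1.33 < 2.353, so the 95% spec cannot be confirmed at alpha = 0.05.
```

With the statistics in hand, the cost questions remain: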
What is the cost of bringing in a new roller now? What is the cost of starting the project, then determining that the current roller is not adequate, and then bringing in a new roller? Will this decision result in a delay in project completion, and does the contract include an incentive for early completion and/or a penalty for missing the completion date? If it is possible to conduct additional independent densometer tests, what is the cost of conducting them?

If there is a severe penalty for missing the deadline (or a significant reward for finishing early), the contractor may be willing to incur the cost of bringing in a new roller rather than accepting a 15% probability of being delayed.

6. Conclusion and Discussion: In some cases the decision about which alternative is preferable can be expressed in the form of a probability (or level of confidence) required to make a decision. The decision criterion is then expressed in a hypothesis and the probability of rejecting that hypothesis. In this example, if the hypothesis to be tested is "Using the on-site roller will provide an average soil density of 95% or higher" and the level of confidence is set at 95%, given a sample of four tests the decision will be to bring in a new roller. However, if nine independent tests could be conducted, the results in this example would lead to a decision to use the on-site roller.

7. Applications in Other Areas in Transportation Research: Simple comparisons to specified values can be used in a variety of areas of transportation research. Some examples include:
• Traffic Operations—to compare the average annual number of crashes at intersections with roundabouts with the average annual number of crashes at signalized intersections.
• Pavement Engineering—to test the compressive strength of concrete slabs.
• Maintenance—to test the results of a proposed new deicer compound.

Example 6: Maintenance; Simple Two-Sample Comparisons

Area: Maintenance
Method of Analysis: Simple two-sample comparisons (t-test for paired comparisons; comparing the mean values of two sets of matched data)

1. Research Question/Problem Statement: As a part of a quality control and quality assurance (QC/QA) program for highway maintenance and construction, an agency engineer wants to compare and identify discrepancies in the contractor's testing procedures or equipment in making measurements on materials being used. Specifically, compacted air voids in asphalt mixtures are being measured. In this instance, the agency's test results need to be compared, one-to-one, with the contractor's test results. Samples are drawn or made and then literally split and tested—one by the contractor, one by the agency. Then the pairs of measurements are analyzed. A paired t-test will be used to make the comparison. (For another type of two-sample comparison, see Example 7.)

Question/Issue: Use collected data to test whether two sets of results are similar. Specifically, do two testing procedures to determine air voids produce the same results? Stated in formal terms, the null and alternative hypotheses are:

Ho: There is no mean difference in air voids between agency and contractor test results: $H_o: \bar{X}_d = 0$
Ha: There is a mean difference in air voids between agency and contractor test results: $H_a: \bar{X}_d \neq 0$

(For definitions and more discussion about the formulation of formal hypotheses for testing, see NCHRP Project 20-45, Volume 2, Appendix A and Volume 1, Chapter 2, "Hypothesis.")

2. Identification and Description of Variables: The testing procedure for laboratory-compacted air voids in the asphalt mixture needs to be verified. The split-sample test results for laboratory-compacted air voids are shown in Table 9.

Twenty samples are prepared using the same asphalt mixture. Half of the samples are prepared in the agency's laboratory and the other half in the contractor's laboratory. Given this arrangement, there are basically two variables of concern: who did the testing and the air void determination.

3. Data Collection: A sufficient quantity of asphalt mix to make 10 lots is produced in an asphalt plant located on a highway project. Each of the 10 lots is collected, split into two samples, and labeled. A sample from each lot, 4 inches in diameter and 2 inches in height, is prepared in the contractor's laboratory to determine the air voids in the compacted samples. A matched set of samples is prepared in the agency's laboratory, and a similar volumetric procedure is used to determine the agency's lab-compacted air voids. The lab-compacted air void contents in the asphalt mixture for both the contractor and agency are shown in Table 9.

4. Specification of Analysis Technique and Data Analysis: A paired (two-sided) t-test will be used to determine whether a difference exists between the contractor and agency results. As noted above, in a paired t-test the null hypothesis is that the mean of the differences between each pair of two tests is 0 (there is no difference between the means). The null hypothesis can be expressed as follows:

$H_o: \bar{X}_d = 0$

The alternative hypothesis, that the two means are not equal, can be expressed as follows:

$H_a: \bar{X}_d \neq 0$

The t-statistic for the paired measurements (i.e., the differences between the split-sample test results) is calculated using the following equation:

$t = \frac{\bar{X}_d - 0}{s_d / \sqrt{n}}$

Using the actual data, the value of the t-statistic is calculated as follows:

$t = \frac{-0.88 - 0}{0.70 / \sqrt{10}} = -4.0$

Table 9. Laboratory-compacted air voids in split samples.

Sample     Air Voids (%)             Difference
           Contractor    Agency
1          4.37          4.15         0.21
2          3.76          5.39        -1.63
3          4.10          4.47        -0.37
4          4.39          4.52        -0.13
5          4.06          5.36        -1.29
6          4.14          5.01        -0.87
7          3.92          5.23        -1.30
8          3.38          4.97        -1.60
9          4.12          4.37        -0.25
10         3.68          5.29        -1.61
Mean       3.99          4.88        X̄d = -0.88
Std dev    0.31          0.46        sd = 0.70
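Before comparing against the critical value, note that the paired test can be checked directly from the Table 9 data; a minimal sketch in Python with SciPy (an assumption on tooling) follows:

```python
import numpy as np
from scipy import stats

# Split-sample air voids (%) from Table 9.
contractor = np.array([4.37, 3.76, 4.10, 4.39, 4.06, 4.14, 3.92, 3.38, 4.12, 3.68])
agency = np.array([4.15, 5.39, 4.47, 4.52, 5.36, 5.01, 5.23, 4.97, 4.37, 5.29])

t_stat, p_value = stats.ttest_rel(contractor, agency)   # paired, two-sided
t_crit = stats.t.ppf(1 - 0.05 / 2, df=len(contractor) - 1)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, t_crit = {t_crit:.3f}")
# |t| of about 4.0 exceeds 2.262, so the paired results differ.
```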

For n − 1 (10 − 1 = 9) degrees of freedom and α = 0.05, the tcrit value can be looked up using a t-table (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-4):

$t_{0.025, 9} = 2.262$

For a more detailed description of the t-statistic, see the glossary in NCHRP Project 20-45, Volume 2, Appendix A.

5. Interpreting the Results: Given that |t| = 4.0 > t0.025,9 = 2.262, the engineer would reject the null hypothesis and conclude that the results of the paired tests are different. This means that the contractor and agency test results from paired measurements indicate that the test method, technicians, and/or test equipment are not providing similar results. Notice that the engineer cannot conclude anything about the material or production variation or what has caused the differences to occur.

6. Conclusion and Discussion: The results of the test indicate that a statistically significant difference exists between the test results from the two groups. When making such comparisons, it is important that random sampling be used when obtaining the samples. Also, because sources of variability influence the population parameters, the two sets of test results must have been sampled over the same time period, and the same sampling and testing procedures must have been used. It is best if one sample is drawn and then literally split in two, then another sample drawn, and so on.

The identification of a difference is just that: notice that a difference exists. The reason for the difference must still be determined. A common misinterpretation is that the result of the t-test provides the probability of the null hypothesis being true. Another way to look at the t-test result in this example is to conclude that some alternative hypothesis provides a better description of the data. The result does not, however, indicate that the alternative hypothesis is true.

To ensure practical significance, it is necessary to assess the magnitude of the difference being tested. This can be done by computing confidence intervals, which are used to quantify the range of effect size and are often more useful than simple hypothesis testing. Failure to reject a hypothesis also provides important information. Possible explanations include: occurrence of a type-II error (erroneous acceptance of the null hypothesis); small sample size; a difference too small to detect; an expected difference that did not occur in the data; or no difference/effect. Proper experiment design and data collection can minimize the impact of some of these issues. (For a more comprehensive discussion of this topic, see NCHRP Project 20-45, Volume 2, Chapter 1.)

7. Applications in Other Areas of Transportation Research: The application of the t-test to compare two mean values in other areas of transportation research may include:
• Traffic Operations—to evaluate average delay in bus arrivals at various bus stops.
• Traffic Operations/Safety—to determine the effect of two enforcement methods on reduction in a particular traffic violation.
• Pavement Engineering—to investigate average performance of two pavement sections.
• Environment—to compare average vehicular emissions at two locations in a city.

Example 7: Materials; Simple Two-Sample Comparisons

Area: Materials
Method of Analysis: Simple two-sample comparisons (using the t-test to compare the mean values of two samples and the F-test for comparing variances)
1. Research Question/Problem Statement: As a part of dispute resolution during quality control and quality assurance, a highway agency engineer wants to validate a contractor's test results concerning asphalt content. In this example, the engineer wants to compare the results

of two sets of tests: one from the contractor and one from the agency. Formally, the (null) hypothesis to be tested, Ho, is that the contractor's tests and the agency's tests are from the same population. In other words, the null hypothesis is that the means of the two data sets will be equal, as will the standard deviations. Notice that in the latter instance the variances are actually being compared.

Test results were also compared in Example 6. In that example, the comparison was based on split samples: the same test specimens were tested by two different analysts using different equipment to see if the same results could be obtained by both. The major difference between Example 6 and Example 7 is that, in this example, the two samples are randomly selected from the same pavement section.

Question/Issue: Use collected data to test whether two measured mean values are the same. In this instance, are two mean values of asphalt content the same? Stated in formal terms, the null and alternative hypotheses can be expressed as follows:

Ho: There is no difference in asphalt content between agency and contractor test results: $H_o: (\mu_c - \mu_a) = 0$
Ha: There is a difference in asphalt content between agency and contractor test results: $H_a: (\mu_c - \mu_a) \neq 0$

2. Identification and Description of Variables: The contractor runs 12 asphalt content tests and the agency engineer runs 6 asphalt content tests over the same period of time, using the same random sampling and testing procedures. The question is whether it is likely that the tests have come from the same population, based on their variability.

3. Data Collection: If the agency's objective is simply to identify discrepancies in the testing procedures or equipment, then verification testing should be done on split samples (as in Example 6). Using split samples, the difference in the measured variable can more easily be attributed to testing procedures. A paired t-test should be used. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.") A split sample occurs when a physical sample (of whatever is being tested) is drawn and then literally split into two testable samples.

On the other hand, if the agency's objective is to identify discrepancies in the overall material, process, sampling, and testing processes, then validation testing should be done on independent samples. Notice the use of these terms. It is important to distinguish between testing to verify only the testing process (verification) and testing to compare the overall production, sampling, and testing processes (validation). If independent samples are used, the agency test results still can be compared with contractor test results (using a simple t-test for comparing two means). If the test results are consistent, then the agency and contractor tests can be combined for contract compliance determination.

4. Specification of Analysis Technique and Data Analysis: When comparing the two data sets, it is important to compare both the means and the variances, because the t-test assumes equal variances for the two groups. A different test is used in each instance. The F-test provides a method for comparing the variances (the standard deviation squared) of two sets of data, while differences in means are assessed by the t-test. Generally, construction processes and material properties are assumed to follow a normal distribution.

In this example, a normal distribution is assumed. (The assumption of normality also can be tested, as in Example 4.) The ratios of variances follow an F-distribution, while the means of relatively small samples follow a t-distribution. Using these distributions, hypothesis tests can be conducted using the same concepts that have been discussed in prior examples. (For more information about the F-test, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Compute the F-ratio Test Statistic." For more information about the t-distribution, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A.)

For samples from the same normal population, the statistic F (the ratio of the two sample variances) has a sampling distribution called the F-distribution. For validation and verification testing, the F-test is based on the ratio of the sample variance of the contractor's test results ($s_c^2$) and the sample variance of the agency's test results ($s_a^2$). Similarly, the t-test can be used to test whether the sample mean of the contractor's tests, $\bar{X}_c$, and the agency's tests, $\bar{X}_a$, came from populations with the same mean.

Consider the asphalt content test results from the contractor samples and agency samples (Table 10). In this instance, the F-test is used to determine whether the variance observed for the contractor's tests differs from the variance observed for the agency's tests.

Table 10. Asphalt content test results from independent samples.

Contractor Samples          Agency Samples
 1    6.4                    1    5.4
 2    6.2                    2    5.8
 3    6.0                    3    6.2
 4    6.6                    4    5.4
 5    6.1                    5    5.6
 6    6.0                    6    5.8
 7    6.3
 8    6.1
 9    5.9
10    5.8
11    6.0
12    5.7

Descriptive statistics:
X̄c = 6.1, sc² = 0.064, sc = 0.25, nc = 12
X̄a = 5.7, sa² = 0.092, sa = 0.30, na = 6

Using the F-test

Step 1. Compute the variance ($s^2$) for each set of tests: $s_c^2 = 0.064$ and $s_a^2 = 0.092$. As an example, $s_c^2$ can be calculated as:

$s_c^2 = \frac{\sum_i (x_i - \bar{X}_c)^2}{n - 1} = \frac{(6.4 - 6.1)^2 + (6.2 - 6.1)^2 + \cdots + (5.7 - 6.1)^2}{12 - 1} = 0.0645$

Step 2. Compute the F-statistic:

$F_{calc} = \frac{s_a^2}{s_c^2} = \frac{0.092}{0.064} = 1.43$

Step 3. Determine Fcrit from the F-distribution table, making sure to use the correct degrees of freedom (df) for the numerator (the number of observations minus 1, or na − 1 = 6 − 1 = 5) and the denominator (nc − 1 = 12 − 1 = 11). For α = 0.01, Fcrit = 5.32. The critical F-value can be found from tables (see NCHRP Project 20-45, Volume 2, Appendix C, Table C-5) by reading the F-value for 1 − α = 0.99 with numerator and denominator degrees of freedom of 5 and 11, respectively. Interpolation can be used if the exact degrees of freedom are not available in the table. Alternatively, a statistical function in Microsoft Excel™ can be used to determine the F-value.

Step 4. Compare the two values to determine whether Fcalc < Fcrit. If Fcalc < Fcrit, the variances are treated as equal; if not, they are unequal. In this example, Fcalc (1.43) is, in fact, less than Fcrit (5.32) and, thus, there is no evidence of unequal variances. Given this result, the t-test for the case of equal variances is used to determine whether to declare that the mean of the contractor's tests differs from the mean of the agency's tests.

Using the t-test

Step 1. Compute the sample means ($\bar{X}$) for each set of tests: $\bar{X}_c = 6.1$ and $\bar{X}_a = 5.7$.

Step 2. Compute the pooled variance $s_p^2$ from the individual sample variances:

$s_p^2 = \frac{s_c^2 (n_c - 1) + s_a^2 (n_a - 1)}{n_c + n_a - 2} = \frac{0.064(12 - 1) + 0.092(6 - 1)}{12 + 6 - 2} = 0.0731$

Step 3. Compute the t-statistic using the following equation for equal variances:

$t = \frac{\bar{X}_c - \bar{X}_a}{\sqrt{\frac{s_p^2}{n_c} + \frac{s_p^2}{n_a}}} = \frac{6.1 - 5.7}{\sqrt{\frac{0.0731}{12} + \frac{0.0731}{6}}} = 2.9$

The corresponding critical value is $t_{0.005, 16} = 2.921$. (For more information, see NCHRP Project 20-45, Volume 2, Appendix C, Table C-4, for $A = 1 - \alpha/2$ and $v = 16$.)

5. Interpreting the Results: Given that F < Fcrit (i.e., 1.43 < 5.32), there is no reason to believe that the two sets of data have different variances; that is, they could have come from the same population. Therefore, the t-test assuming equal variances can be used to compare the means. Because t < tcrit (i.e., 2.9 < 2.921), the engineer does not reject the null hypothesis and, thus, assumes that the sample means are equal. The final conclusion is that it is likely that the contractor and agency test results represent the same process. In other words, with a 99% confidence level, it can be said that the agency's test results are not different from the contractor's and therefore validate the contractor's tests.

6. Conclusion and Discussion: The simple t-test can be used to validate the contractor's test results by conducting independent sampling from the same pavement at the same time. Before conducting a formal t-test to compare the sample means, the assumption of equal variances needs to be evaluated. This can be accomplished by comparing sample variances using the F-test. The interpretation of results will be misleading if the equal-variance assumption is not validated: if the variances of the two populations being compared for their means are different, the mean comparison will reflect the difference between two separate populations. Finally, based on the comparison of means, one can conclude that the construction materials have consistent properties as validated by two independent sources (contractor and agency). This sort of comparison is developed further in Example 8, which illustrates tests for the equality of more than two mean values.
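A sketch of both steps in Python with SciPy (an assumption on tooling; SciPy has no single variance-ratio F-test function, so the ratio and its critical value are computed directly), using the Table 10 data:

```python
import numpy as np
from scipy import stats

contractor = np.array([6.4, 6.2, 6.0, 6.6, 6.1, 6.0, 6.3, 6.1, 5.9, 5.8, 6.0, 5.7])
agency = np.array([5.4, 5.8, 6.2, 5.4, 5.6, 5.8])

# F-test for equal variances (larger sample variance in the numerator).
f_calc = agency.var(ddof=1) / contractor.var(ddof=1)
f_crit = stats.f.ppf(0.99, dfn=len(agency) - 1, dfd=len(contractor) - 1)
print(f"F = {f_calc:.2f}, F_crit(0.01; 5, 11) = {f_crit:.2f}")

# No evidence of unequal variances, so use the pooled (equal-variance) t-test.
t_stat, p_value = stats.ttest_ind(contractor, agency, equal_var=True)
t_crit = stats.t.ppf(1 - 0.01 / 2, df=len(contractor) + len(agency) - 2)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, t_crit(0.005, 16 df) = {t_crit:.3f}")
# t is about 2.90 < 2.921, so the means are not declared different at alpha = 0.01.
```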

7. Applications in Other Areas of Transportation Research: The simple t-test can be used to compare means of two independent samples. Applications for this method in other areas of transportation research may include:
• Traffic Operations
– to compare average speeds at two locations along a route.
– to evaluate average delay times at two intersections in an urban area.
• Pavement Engineering—to investigate the difference in average performance of two pavement sections.
• Maintenance—to determine the effects of two maintenance treatments on average life extension of two pavement sections.

Example 8: Laboratory Testing/Instrumentation; Simple Analysis of Variance (ANOVA)

Area: Laboratory testing and/or instrumentation
Method of Analysis: Simple analysis of variance (ANOVA) comparing the mean values of more than two samples and using the F-test

1. Research Question/Problem Statement: An engineer wants to test and compare the compressive strength of five different concrete mix designs that vary in coarse aggregate type, gradation, and water/cement ratio. An experiment is conducted in a laboratory where five different concrete mixes are produced based on given specifications and tested for compressive strength using ASTM International standard procedures. In this example, the comparison involves inference on parameters from more than two populations. The purpose of the analysis, in other words, is to test whether all mix designs are similar to each other in mean compressive strength or whether some differences actually exist. ANOVA is the statistical procedure used to test the basic hypothesis illustrated in this example.

Question/Issue: Compare the means of more than two samples. In this instance, compare the compressive strengths of five concrete mix designs with different combinations of aggregates, gradation, and water/cement ratio. More formally, test the following hypotheses:

Ho: There is no difference in mean compressive strength for the various (five) concrete mix types.
Ha: At least one of the concrete mix types has a different compressive strength.

2. Identification and Description of Variables: In this experiment, the factor of interest (independent variable) is the concrete mix design, which has five levels based on different coarse aggregate types, gradations, and water/cement ratios (denoted by t and labeled A through E in Table 11). Compressive strength is a continuous response (dependent) variable, measured in pounds per square inch (psi) for each specimen. Because only one factor is of interest in this experiment, the statistical method illustrated is often called a one-way ANOVA or simple ANOVA.

3. Data Collection: For each of the five mix designs, three replicate cylinders, 4 inches in diameter and 8 inches in height, are made and cured for 28 days. After 28 days, all 15 specimens are tested for compressive strength using the standard ASTM International test. The compressive strength data and summary statistics are provided for each mix design in Table 11. In this example, resource constraints have limited the number of replicates for each mix design to three.

(For a discussion on sample size determination based on statistical power requirements, see NCHRP Project 20-45, Volume 2, Chapter 1, "Sample Size Determination.")

Table 11. Concrete compressive strength (psi) after 28 days.

                       Mix Design
Replicate        A        B        C        D        E
1              5416     5292     4097     5056     4165
2              5125     4779     3695     5216     3849
3              4847     4824     4109     5235     4089
Mean           5129     4965     3967     5169     4034
Std dev      284.52   284.08   235.64    98.32   164.94
Overall mean: 4653

4. Specification of Analysis Technique and Data Analysis: To perform a one-way ANOVA, preliminary calculations are carried out to compute the overall mean ($\bar{y}_{..}$), the sample means ($\bar{y}_{i.}$), and the sample variances ($s_i^2$) given the total sample size ($n_T = 15$), as shown in Table 11. The basic strategy for ANOVA is to compare the variance between levels or groups—specifically, the variation between sample means—to the variance within levels. This comparison is used to determine if the levels explain a significant portion of the variance. (Details for performing a one-way ANOVA are given in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.")

ANOVA is based on partitioning of the total sum of squares (TSS, a measure of overall variability) into within-level and between-levels components. The TSS is defined as the sum of the squares of the differences of each observation ($y_{ij}$) from the overall mean ($\bar{y}_{..}$). The TSS, between-levels sum of squares (SSB), and within-level sum of squares (SSE) are computed as follows:

$TSS = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2 = 4{,}839{,}620.90$

$SSB = \sum_{i,j} (\bar{y}_{i.} - \bar{y}_{..})^2 = 4{,}331{,}513.60$

$SSE = \sum_{i,j} (y_{ij} - \bar{y}_{i.})^2 = 508{,}107.30$

The next step is to compute the between-levels mean square (MSB) and within-levels mean square (MSE) based on their respective degrees of freedom (df). The total degrees of freedom ($df_T$), between-levels degrees of freedom ($df_B$), and within-levels degrees of freedom ($df_E$) for one-way ANOVA are computed as follows:

$df_T = n_T - 1 = 15 - 1 = 14$
$df_B = t - 1 = 5 - 1 = 4$
$df_E = n_T - t = 15 - 5 = 10$

where $n_T$ is the total sample size and t is the total number of levels or groups.

The next step of the ANOVA procedure is to compute the F-statistic. The F-statistic is the ratio of two variances: the variance due to interaction between the levels and the variance due to differences within the levels. Under the null hypothesis, the between-levels mean square (MSB) and within-levels mean square (MSE) provide two independent estimates of the variance. If the means for the different levels of mix design are truly different from each other, the MSB will tend

to be larger than the MSE, making it more likely that the null hypothesis will be rejected. For this example, the calculations for MSB, MSE, and F are as follows:

$MSB = \frac{SSB}{df_B} = 1{,}082{,}878.40$

$MSE = \frac{SSE}{df_E} = 50{,}810.70$

$F = \frac{MSB}{MSE} = 21.31$

If there are no effects due to level, the F-statistic will tend to be smaller. If there are effects due to level, the F-statistic will tend to be larger, as is the case in this example. ANOVA computations usually are summarized in the form of a table. Table 12 summarizes the computations for this example.

Table 12. ANOVA results.

Source    Sum of Squares (SS)   df   Mean Square (MS)   F       Probability > F (Significance)
Between   4,331,513.60           4   1,082,878.40       21.31   0.0000698408
Within      508,107.30          10      50,810.70
Total     4,839,620.90          14

The final step is to determine Fcrit from the F-distribution table (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-5) with t − 1 (5 − 1 = 4) degrees of freedom for the numerator and nT − t (15 − 5 = 10) degrees of freedom for the denominator. For a significance level of α = 0.01, Fcrit is found (in Table C-5) to be 5.99. Given that F > Fcrit (21.31 > 5.99), the null hypothesis that all mix designs have equal compressive strength is rejected, supporting the conclusion that at least two mix designs differ from each other in their mean effect. Table 12 also shows the p-value calculated using a computer program. The p-value is the probability that a sample would result in the given statistic value if the null hypothesis were true. The p-value of 0.0000698408 is well below the chosen significance level of 0.01.

5. Interpreting the Results: The ANOVA results in rejection of the null hypothesis at α = 0.01. That is, the mean values are judged to be statistically different. However, the ANOVA result does not indicate where the difference lies. For example, does the compressive strength of mix design A differ from that of mix design C or D? To carry out such multiple mean comparisons, the analyst must control the experiment-wise error rate (EER) by employing more conservative methods such as Tukey's test, Bonferroni's test, or Scheffé's test, as appropriate. (Details for ANOVA are given in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.")

The coefficient of determination (R²) provides a rough indication of how well the statistical model fits the data. For this example, R² is calculated as follows:

$R^2 = \frac{SSB}{TSS} = \frac{4{,}331{,}513.60}{4{,}839{,}620.90} = 0.90$

For this example, R² indicates that the one-way ANOVA classification model accounts for 90% of the total variation in the data. In the controlled laboratory experiment demonstrated in this example, R² = 0.90 indicates a fairly acceptable fit of the statistical model to the data.
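A minimal one-way ANOVA check of Table 12 in Python with SciPy (an assumption on tooling; statistical packages such as R or SAS would serve equally well):

```python
from scipy import stats

# Compressive strengths (psi) from Table 11, three replicates per mix.
mix_a = [5416, 5125, 4847]
mix_b = [5292, 4779, 4824]
mix_c = [4097, 3695, 4109]
mix_d = [5056, 5216, 5235]
mix_e = [4165, 3849, 4089]

f_stat, p_value = stats.f_oneway(mix_a, mix_b, mix_c, mix_d, mix_e)
f_crit = stats.f.ppf(0.99, dfn=4, dfd=10)
print(f"F = {f_stat:.2f}, p = {p_value:.2e}, F_crit(0.01; 4, 10) = {f_crit:.2f}")
# F of about 21.31 exceeds 5.99, reproducing the rejection shown in Table 12.
```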

6. Conclusion and Discussion: This example illustrates a simple one-way ANOVA in which inference regarding parameters (mean values) from more than two populations or treatments was desired. The focus of the computations was the construction of the ANOVA table. Before proceeding with ANOVA, however, an analyst must verify that the assumptions of common variance and data normality are satisfied within each group/level. The results do not establish the cause of the difference in compressive strength between mix designs in any way. The experimental setup and analytical procedure shown in this example may be used to test other properties of mix designs, such as flexural strength. If another factor (for example, water/cement ratio with levels low or high) is added to the analysis, the classification becomes a two-way ANOVA. (In this report, two-way ANOVA is demonstrated in Example 11.) Notice that the equations shown in Example 8 may only be used for one-way ANOVA with balanced designs, meaning that there are equal numbers of replicates for each level within a factor. (For a discussion of computations on unbalanced designs and multifactor designs, see NCHRP Project 20-45.)

7. Applications in Other Areas of Transportation Research: Examples of applications of one-way ANOVA in other areas of transportation research include:
• Traffic Operations—to determine the effect of various traffic calming devices on average speeds in residential areas.
• Traffic Operations/Safety—to study the effect of weather conditions on accidents in a given time period.
• Work Zones—to compare the effect of different placements of work zone signs on reduction in highway speeds at some downstream point.
• Materials—to investigate the effect of recycled aggregates on compressive and flexural strength of concrete.

Example 9: Materials; Simple Analysis of Variance (ANOVA)

Area: Materials
Method of Analysis: Simple analysis of variance (ANOVA) comparing more than two mean values and using the F-test for equality of means

1. Research Question/Problem Statement: To illustrate how increasingly detailed analysis may be appropriate, Example 9 is an extension of the two-sample comparison presented in Example 7. As a part of dispute resolution during quality control and quality assurance, say the highway agency engineer from Example 7 decides to reconfirm the contractor's test results for asphalt content. The agency hires an independent consultant to verify both the contractor- and agency-measured asphalt contents. It now becomes necessary to compare more than two mean values. A simple one-way analysis of variance (ANOVA) can be used to analyze the asphalt contents measured by the three different parties.

Question/Issue: Extend a comparison of two mean values to compare three (or more) mean values. Specifically, use data collected by several (more than two) different parties to see if the results (mean values) are the same. Formally, test the following null (Ho) and alternative (Ha) hypotheses, which can be stated as follows:

Ho: There is no difference in asphalt content among the three different parties: $H_o: \mu_{contractor} = \mu_{agency} = \mu_{consultant}$
Ha: At least one of the parties has a different measured asphalt content.

2. Identification and Description of Variables: The independent consultant runs 12 additional asphalt content tests by taking independent samples from the same pavement section as the agency and contractor. The question is whether it is likely that the tests came from the same population, based on their variability.

3. Data Collection: The descriptive statistics (mean, standard deviation, and sample size) for the asphalt content data collected by the three parties are shown in Table 13. Notice that 12 measurements each have been taken by the contractor and the independent consultant, while the agency has taken only six measurements. The data for the contractor and the agency are the same as presented in Example 7. For brevity, the consultant's raw observations are not repeated here. The mean value and standard deviation for the consultant's data are calculated using the same formulas and equations that were used in Example 7.

Table 13. Asphalt content data summary.

Party        Asphalt Content (%)
Contractor   X̄1 = 6.1    s1 = 0.254   n1 = 12
Agency       X̄2 = 5.7    s2 = 0.303   n2 = 6
Consultant   X̄3 = 5.12   s3 = 0.186   n3 = 12

4. Specification of Analysis Technique and Data Analysis: The agency engineer can use one-way ANOVA to resolve this question. (Details for one-way ANOVA are available in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.") The objective of the ANOVA is to determine whether the variance observed in the dependent variable (in this case, asphalt content) is due to differences among the samples (from one party to another) or due to differences within the samples. ANOVA is basically an extension of two-sample comparisons to cases where three or more samples are being compared. More formally, the engineer is testing to see whether the between-sample variability is large relative to the within-sample variability, as stated in the formal hypothesis. This type of comparison also may be referred to as between-groups versus within-groups variance.

Rejection of the null hypothesis (that the mean values are the same) gives the engineer some information concerning differences among the population means; however, it does not indicate which means actually differ from each other. Rejection of the null hypothesis tells the engineer that differences exist, but it does not specify that $\bar{X}_1$ differs from $\bar{X}_2$ or from $\bar{X}_3$. To control the experiment-wise error rate (EER) for multiple mean comparisons, a conservative test—Tukey's procedure for unplanned comparisons—can be used. (Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].) The F-statistic calculated for determining the effect of who (agency, contractor, or consultant) measured

the asphalt content is given in Table 14. (See Example 8 for a more detailed discussion of the calculations necessary to create Table 14.)

Table 14. ANOVA results.

Source           Sum of Squares (SS)   df   Mean Square (MS)   F      Significance
Between groups   5.6                    2   2.8                49.1   0.000
Within groups    1.5                   27   0.06
Total            7.2                   29

Although the ANOVA results reveal whether there are overall differences, it is always good practice to examine the data visually. For example, Figure 9 shows the mean and associated 95% confidence intervals (CI) of the mean asphalt content measured by each of the three parties involved in the testing.

[Figure 9. Mean and confidence intervals for asphalt content data.]

5. Interpreting the Results: A simple one-way ANOVA is conducted to determine whether there is a difference in mean asphalt content as measured by the three different parties. The analysis shows that the F-statistic is significant (p-value < 0.05), meaning that at least two of the means are significantly different from each other. The engineer can use Tukey's procedure for comparisons of multiple means, or he or she can observe the plotted 95% confidence intervals to figure out which means are actually (and significantly) different from each other (see Figure 9). Because their confidence intervals overlap, the results show that the asphalt contents measured by the contractor and the agency are not significantly different. (These same conclusions were obtained in Example 7.) However, the mean asphalt content obtained by the consultant is significantly different from (and lower than) that obtained by both of the other parties. This is evident because the confidence interval for the consultant does not overlap with the confidence interval of either of the other two parties.
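Because the consultant's raw observations are not reproduced in the report, a code illustration of Tukey's procedure is easiest on the Example 8 data, where all raw values are available (Table 11). The sketch below assumes a recent SciPy release that provides `scipy.stats.tukey_hsd`; statsmodels' `pairwise_tukeyhsd` is an alternative.

```python
from scipy import stats  # tukey_hsd requires a recent SciPy release

# Compressive strengths (psi) from Example 8, Table 11.
mix_a = [5416, 5125, 4847]
mix_b = [5292, 4779, 4824]
mix_c = [4097, 3695, 4109]
mix_d = [5056, 5216, 5235]
mix_e = [4165, 3849, 4089]

# Pairwise comparisons with the experiment-wise error rate controlled.
result = stats.tukey_hsd(mix_a, mix_b, mix_c, mix_d, mix_e)
print(result)  # pairwise mean differences, confidence intervals, and p-values
```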

6. Conclusion and Discussion: This example uses a simple one-way ANOVA to compare the mean values of three sets of results using data drawn from the same test section. The error bar plots for data from the three different parties visually illustrate the statistical differences in the multiple means. However, the F-test for multiple means should be used to formally test the hypothesis of the equality of means. The interpretation of results will be misleading if the variances of the populations being compared are not equal. Based on the comparison of the three means, it can be concluded that the construction material in this example may not have consistent properties, as indicated by the results from the independent consultant.

7. Applications in Other Areas of Transportation Research: Simple one-way ANOVA is often used when more than two means must be compared. Examples of applications in other areas of transportation research include:
• Traffic Safety/Operations—to evaluate the effect of intersection type on the average number of accidents per month. Three or more types of intersections (e.g., signalized, non-signalized, and rotary) could be selected for study in an urban area having similar traffic volumes and vehicle mix.
• Pavement Engineering
– to investigate the effect of hot-mix asphalt (HMA) layer thickness on fatigue cracking after 20 years of service life. Three HMA layer thicknesses (5 inches, 6 inches, and 7 inches) are to be involved in this study, and other factors (i.e., traffic, climate, and subbase/base thicknesses and subgrade types) need to be similar.
– to determine the effect of climatic conditions on rutting performance of flexible pavements. Three or more climatic conditions (e.g., wet-freeze, wet-no-freeze, dry-freeze, and dry-no-freeze) need to be considered while other factors (i.e., traffic, HMA, and subbase/base thicknesses and subgrade types) need to be similar.

Example 10: Pavements; Simple Analysis of Variance (ANOVA)

Area: Pavements
Method of Analysis: Simple analysis of variance (ANOVA) comparing the mean values of more than two samples and using the F-test

1. Research Question/Problem Statement: The aggregate coefficient of thermal expansion (CTE) in Portland cement concrete (PCC) is a critical factor affecting the thermal behavior of PCC slabs in concrete pavements. In addition, the interaction between slab curling (caused by the thermal gradient) and axle loads is assumed to be a critical factor for concrete pavement performance in terms of cracking. To verify the effect of aggregate CTE on slab cracking, a pavement engineer wants to conduct a simple observational study by collecting field pavement performance data on three different types of pavement. For this example, three types of aggregate (limestone, dolomite, and gravel) are being used in concrete pavement construction and yield the following CTEs:
• 4 in./in. per °F
• 5 in./in. per °F
• 6.5 in./in. per °F

It is necessary to compare more than two mean values. A simple one-way ANOVA is used to analyze the observed slab cracking performance for the three concrete mixes with different aggregate types based on geology (limestone, dolomite, and gravel). All other factors that might cause variation in cracking are assumed to be held constant.

Question/Issue: Compare the means of more than two samples. Specifically, is the cracking performance of concrete pavements designed using more than two different types of aggregates the same? Stated a bit differently, is the performance of three different types of concrete pavement statistically different (are the mean performance measures different)?

2. Identification and Description of Variables: The engineer identifies 1-mile sections of uniform pavement within the state highway network with similar attributes (aggregate type, slab thickness, joint spacing, traffic, and climate). Field performance, in terms of the observed percentage of slabs cracked ("% slab cracked") for each pavement section after about 20 years of service, is considered in the analysis. The available pavement data are grouped (stratified) based on the aggregate type (CTE value). The % slab cracked after 20 years is the dependent variable, while the CTE of the aggregates is the independent variable. The question is whether pavement sections having different types of aggregate (CTE values) exhibit similar performance based on their variability.

3. Data Collection: From the data stratified by CTE, the engineer randomly selects nine pavement sections within each CTE category (i.e., 4, 5, and 6.5 in./in. per °F). The sample size is based on the statistical power (1 − β) requirements. (For a discussion on sample size determination based on statistical power requirements, see NCHRP Project 20-45, Volume 2, Chapter 1, "Sample Size Determination.") The descriptive statistics for the data, organized by the three CTE categories, are shown in Table 15.

Table 15. Pavement performance data.

CTE (in./in. per °F)   % Slab Cracked After 20 Years
4                      X̄1 = 37,   s1 = 4.8,   n1 = 9
5                      X̄2 = 53.7, s2 = 6.1,   n2 = 9
6.5                    X̄3 = 72.5, s3 = 6.3,   n3 = 9

4. Specification of Analysis Technique and Data Analysis: Because the engineer is concerned with the comparison of more than two mean values, the easiest way to make the statistical comparison is to perform a one-way ANOVA (see NCHRP Project 20-45, Volume 2, Chapter 4). The comparison will help to determine whether the between-section variability is large relative to the within-section variability. More formally, the following hypotheses are tested:

Ho: All mean values are equal (i.e., $\mu_1 = \mu_2 = \mu_3$).
Ha: At least one of the means is different from the rest.

Although rejection of the null hypothesis gives the engineer some information concerning differences among the population means, it doesn't tell the engineer anything about how the means differ from each other. For example, does $\mu_1$ differ from $\mu_2$ or $\mu_3$? To control the experiment-wise error rate (EER) for multiple mean comparisons, a conservative test—Tukey's procedure for unplanned comparisons—can be used. (Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].) The F-statistic calculated for determining the effect of CTE on % slab cracked after 20 years is shown in Table 16.

examples of effective experiment Design and Data analysis in transportation research 43 The data in Table 16 have been produced by considering the original data and following the procedures presented in earlier examples. The emphasis in this example is on understanding what the table of results provides the researcher. Also in this example, the test for homogeneity of variances (Levene test) shows no significant difference among the standard deviations of % slab cracked for different CTE values. Figure 10 presents the mean and associated 95% confi- dence intervals of the average % slab cracked (also called the mean and error bars) measured for the three CTE categories considered. 5. Interpreting the Results: A simple one-way ANOVA is conducted to determine if there is a difference among the mean values for % slab cracked for different CTE values. The analysis shows that the F-statistic is significant (p-value < 0.05), meaning that at least two of the means are statistically significantly different from each other. To gain more insight, the engineer can use Tukey’s procedure to specifically compare the mean values, or the engineer may simply observe the plotted 95% confidence intervals to ascertain which means are significantly different from each other (see Figure 10). The plotted results show that the mean % slab cracked varies significantly for different CTE values—there is no overlap between the different mean/error bars. Figure 10 also shows that the mean % slab cracked is significantly higher for pavement sections having a higher CTE value. (For more information about Tukey’s procedure, see NCHRP Project 20-45, Volume 2, Chapter 4.) 6. Conclusion and Discussion: In this example, simple one-way ANOVA is used to assess the effect of CTE on cracking performance of rigid pavements. The F-test for multiple means is used to formally test the (null) hypothesis of mean equality. The confidence interval plots for data from pavements having three different CTE values visually illustrate the statistical differ- ences in the three means. The interpretation of results will be misleading if the variances of Source Sum of Squares (SS) Degrees of Freedom (df) Mean Square (MS) F Significance Between groups 5652.7 2 0.0002826.3 84.1 Within groups 806.9 24 33.6 Total 6459.6 26 Table 16. ANOVA results. Figure 10. Error bars for % slab cracked with different CTE.

44 effective experiment Design and Data analysis in transportation research populations being compared for their mean difference are not equal or if a proper multiple mean comparisons procedure is not adopted. Based on the comparison of the three means in this example, the engineer can conclude that the pavement slabs having aggregates with a higher CTE value will exhibit more cracking than those with lower CTE values, given that all other variables (e.g., climate effects) remain constant. 7. Applications in Other Areas of Transportation Research: Simple one-way ANOVA is widely used and can be employed whenever multiple means within a factor are to be compared with one another. Potential applications in other areas of transportation research include: • Traffic Operations—to evaluate the effect of commuting time on level of service (LOS) of an urban highway. Mean travel times for three periods (e.g., morning, afternoon, and evening) could be selected for specified highway sections to collect the traffic volume and headway data in all lanes. • Traffic Safety—to determine the effect of shoulder width on accident rates on rural highways. More than two shoulder widths (e.g., 0 feet, 6 feet, 9 feet, and 12 feet) should be selected in this study. • Pavement Engineering—to investigate the impact of air void content on flexible pavement fatigue performance. Pavement sections having three or more air void contents (e.g., 3%, 5%, and 7%) in the surface HMA layer could be selected to compare their average fatigue cracking performance after the same period of service (e.g., 15 years). • Materials—to study the effect of aggregate gradation on the rutting performance of flexible pavements. Three types of aggregate gradations (fine, intermediate, and coarse) could be adopted in the laboratory to make different HMA mix samples. Performance testing could be conducted in the laboratory to measure rut depths for a given number of load cycles. Example 11: Pavements; Factorial Design (ANOVA Approach) Area: Pavements Method of Analysis: Factorial design (an ANOVA approach used to explore the effects of varying more than one independent variable) 1. Research Question/Problem Statement: Extending the information from Example 10 (a simple ANOVA example for pavements), the pavement engineer has verified that the coefficient of thermal expansion (CTE) in Portland cement concrete (PCC) is a critical factor affecting thermal behavior of PCC slabs in concrete pavements and significantly affects concrete pave- ment performance in terms of cracking. The engineer now wants to investigate the effects of another factor, joint spacing (JS), in addition to CTE. To study the combined effects of PCC CTE and JS on slab cracking, the engineer needs to conduct a factorial design study by collect- ing field pavement performance data. As before, three CTEs will be considered: • 4 in./in. per °F, • 5 in./in. per °F, and • 6.5 in./in. per °F. Now, three different joint spacings (12 ft, 16 ft, and 20 ft) also will be considered. For this example, it is necessary to compare multiple means within each factor (main effects) and the interaction between the two factors (interactive effects). The statistical technique involved is called a multifactorial two-way ANOVA. 2. Identification and Description of Variables: The engineer identifies uniform 1-mile pavement sections within the state highway network with similar attributes (e.g., slab thickness, traffic, and climate). 
The field performance, in terms of observed percentage of each slab cracked (% slab cracked) after about 20 years of service for each pavement section, is considered the

examples of effective experiment Design and Data analysis in transportation research 45 dependent (or response) variable in the analysis. The available pavement data are stratified based on CTE and JS. CTE and JS are considered the independent variables. The question is whether pavement sections having different CTE and JS exhibit similar performance based on their variability. Question/Issue Use collected data to determine the effects of varying more than one independent variable on some measured outcome. In this example, compare the cracking perfor- mance of concrete pavements considering two independent variables: (1) coefficients of thermal expansion (CTE) as measured using more than two types of aggregate and (2) differing joint spacing (JS). More formally, the hypotheses can be stated as follows: Ho : ai = 0, No difference in % slabs cracked for different CTE values. Ho : gj = 0, No difference in % slabs cracked for different JS values. Ho : (ag)ij = 0, for all i and j, No difference in % slabs cracked for different CTE and JS combinations. 3. Data Collection: The descriptive statistics for % slab cracked data by three CTE and three JS categories are shown in Table 17. From the data stratified by CTE and JS, the engineer has randomly selected three pavement sections within each of nine combinations of CTE values. (In other words, for each of the nine pavement sections from Example 10, the engineer has selected three JS.) 4. Specification of Analysis Technique and Data Analysis: The engineer can use two-way ANOVA test statistics to determine whether the between-section variability is large relative to the within-section variability for each factor to test the following null hypotheses: • Ho : ai = 0 • Ho : gj = 0 • Ho : (ag)ij = 0 As mentioned before, although rejection of the null hypothesis does give the engineer some information concerning differences among the population means (i.e., there are differences among them), it does not clarify which means differ from each other. For example, does µ1 differ from µ2 or µ3? To control the experiment-wise error rate (EER) for the comparison of multiple means, a conservative test—Tukey’s procedure for an unplanned comparison—can be used. (Information about two-way ANOVA is available in NCHRP Project 20-45, Volume 2, CTE (in/in per oF) Marginal µ & σ 4 5 6.5 Joint spacing (ft) 12 1,1 = 32.4 s1,1 = 0.1 1,2 = 46.8 s1,2 = 1.8 1,3 = 65.3 s 1,3 = 3.2 1,. = 48.2 s1,. = 14.4 16 2,1 = 36.0 s2,1 = 2.4 2,2 = 54 s2,2 = 2.9 2,3 = 73 s2,3 = 1.1 2,. = 54.3 s2,. = 16.1 20 3,1 = 42.7 s3,1 = 2.4 3,2 = 60.3 s3,2 = 0.5 3,3 = 79.1 s3,3 = 2.0 3,. = 60.7 s3,. = 15.9 Marginal µ & σ .,1 = 37.0 x– x– x– x– x– x– x– x– x– x– x– x– x– x– x– x– s.,1 = 4.8 .,2 = 53.7 s.,2 = 6.1 .,3 = 72.5 s.,3 = 6.3 .,. = 54.4 s.,. = 15.8 Note: n = 3 in each cell; values are cell means and standard deviations. Table 17. Summary of cracking data.

46 effective experiment Design and Data analysis in transportation research Chapter 4. Information about Tukey’s procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].) The results of the two-way ANOVA are shown in Table 18. From the first line it can be seen that both of the main effects, CTE and JS, are significant in explaining cracking behavior (i.e., both p-values < 0.05). However, the interaction (CTE × JS) is not significant (i.e., the p-value is 0.999, much greater than 0.05). Also, the test for homogeneity of variances (Levene statistic) shows that there is no significant difference among the standard deviations of % slab cracked for different CTE and JS values. Figure 11 illustrates the main and interactive effects of CTE and JS on % slabs cracked. 5. Interpreting the Results: A two-way (multifactorial) ANOVA is conducted to determine if difference exists among the mean values for “% slab cracked” for different CTE and JS values. The analysis shows that the main effects of both CTE and JS are significant, while the inter- action effect is insignificant (p-value > 0.05). These results show that when CTE and JS are considered jointly, they significantly impact the slab cracking separately. Given these results, the conclusions from the results will be based on the main effects alone without considering interaction effects. In fact, if the interaction effect had been significant, the conclusions would be based on them. To gain more insight, the engineer can use Tukey’s procedure to compare specific multiple means within each factor, or the engineer can simply observe the plotted means in Figure 11 to ascertain which means are significantly different from each other. The plotted results show that the mean % slab cracked varies significantly for different CTE and JS values; that is, the CTE seems to be more influential than JS. All lines are almost parallel to Source Sum of Squares (SS) Degrees of Freedom (df) Mean Square (MS) F Significance CTE 5677.74 2 2838.87 657.16 0.000 JS 703.26 2 351.63 81.40 0.000 CTE × JS 0.12 4 0.03 0.007 0.999 Residual/error 77.76 18 4.32 Total 6458.88 26 Table 18. ANOVA results. M ea n % s la bs c ra ck ed 6.55.04.0 75 70 65 60 55 50 45 40 35 201612 CTE JS Main Effects Plot (data means) for Cracking Joint Spacing (ft) M ea n % s la bs c ra ck ed 201612 80 70 60 50 40 30 CTE 6.5 4.0 5.0 Interaction Plot (data means) for Cracking Figure 11. Main and interaction effects of CTE and JS on slab cracking.

examples of effective experiment Design and Data analysis in transportation research 47 each other when plotted for both factors together, showing no interactive effects between the levels of two factors. 6. Conclusion and Discussion: The two-way ANOVA can be used to verify the combined effects of CTE and JS on cracking performance of rigid pavements. The marginal mean plot for cracking having three different CTE and JS levels visually illustrates the differences in the multiple means. The plot of cell means for cracking within the levels of each factor can indicate the presence of interactive effect between two factors (in this example, CTE and JS). However, the F-test for multiple means should be used to formally test the hypothesis of mean equality. Finally, based on the comparison of three means within each factor (CTE and JS), the engineer can conclude that the pavement slabs having aggregates with higher CTE and JS values will exhibit more cracking than those with lower CTE and JS values. In this example, the effect of CTE on concrete pavement cracking seems to be more critical than that of JS. 7. Applications in Other Areas of Transportation Research: Multifactorial designs can be used when more than one factor is considered in a study. Possible applications of these methods can extend to all transportation-related areas, including: • Pavement Engineering – to determine the effects of base type and base thickness on pavement performance of flexible pavements. Two or more levels can be considered within each factor; for exam- ple, two base types (aggregate and asphalt-treated bases) and three base thicknesses (8 inches, 12 inches, and 18 inches). – to investigate the impact of pavement surface conditions and vehicle type on fuel con- sumption. The researcher can select pavement sections with three levels of ride quality (smooth, rough, and very rough) and three types of vehicles (cars, vans, and trucks). The fuel consumptions can be measured for each vehicle type on all surface conditions to determine their impact. • Materials – to study the effects of aggregate gradation and surface on tensile strength of hot-mix asphalt (HMA). The engineer can evaluate two levels of gradation (fine and coarse) and two types of aggregate surfaces (smooth and rough). The samples can be prepared for all the combinations of aggregate gradations and surfaces for determination of tensile strength in the laboratory. – to compare the impact of curing and cement types on the compressive strength of concrete mixture. The engineer can design concrete mixes in laboratory utilizing two cement types (Type I & Type III). The concrete samples can be cured in three different ways for 24 hours and 7 days (normal curing, water bath, and room temperature). Example 12: Work Zones; Simple Before-and-After Comparisons Area: Work zones Method of Analysis: Simple before-and-after comparisons (exploring the effect of some treat- ment before it is applied versus after it is applied) 1. Research Question/Problem Statement: The crash rate in work zones has been found to be higher than the crash rate on the same roads when a work zone is not present. For this reason, the speed limit in construction zones often is set lower than the prevailing non-work-zone speed limit. The state DOT decides to implement photo-radar speed enforcement in a work zone to determine if this speed-enforcement technique reduces the average speed of free- flowing vehicles in the traffic stream. 
They measure the speeds of a sample of free-flowing vehicles prior to installing the photo-radar speed-enforcement equipment in a work zone and

48 effective experiment Design and Data analysis in transportation research then measure the speeds of free-flowing vehicles at the same location after implementing the photo-radar system. Question/Issue Use collected data to determine whether a difference exists between results before and after some treatment is applied. For this example, does a photo-radar speed- enforcement system reduce the speed of free-flowing vehicles in a work zone, and, if so, is the reduction statistically significant? 2. Identification and Description of Variables: The variable to be analyzed is the mean speed of vehicles before and after the implementation of a photo-radar speed-enforcement system in a work zone. 3. Data Collection: The speeds of individual free-flowing vehicles are recorded for 30 minutes on a Tuesday between 10:00 a.m. and 10:30 a.m. before installing the photo-radar system. After the system is installed, the speeds of individual free-flowing vehicles are recorded for 30 minutes on a Tuesday between 10:00 a.m. and 10:30 a.m. The before sample contains 120 observations and the after sample contains 100 observations. 4. Specification of Analysis Technique and Data Analysis: A test of the significance of the difference between two means requires a statement of the hypothesis to be tested (Ho) and a statement of the alternate hypothesis (H1). In this example, these hypotheses can be stated as follows: Ho: There is no difference in the mean speed of free-flowing vehicles before and after the photo-radar speed-enforcement system is displayed. H1: There is a difference in the mean speed of free-flowing vehicles before and after the photo-radar speed-enforcement system is displayed. Because these two samples are independent, a simple t-test is appropriate to test the stated hypotheses. This test requires the following procedure: Step 1. Compute the mean speed (x _ ) for the before sample (x _ b) and the after sample (x _ a) using the following equation: x x n n ni i i n i b a= = = = ∑ 1 120 100; and Results: x _ b = 53.1 mph and x _ a = 50.5 mph. Step 2. Compute the variance (S2) for each sample using the following equation: S x x n i i i n 2 2 1 1 = −( ) − − ∑ where na = 100; x _ a= 50.5 mph; nb = 120; and x _ b = 53.1 mph Results: S x x n b b b b 2 2 1 12 06= −( ) − =∑ . and S x x n a a a a 2 2 1 12 97= −( ) − =∑ . . Step 3. Compute the pooled variance of the two samples using the following equation: S x x x x n n p a a b b b a 2 2 2 2 = −( ) + −( ) + − ∑∑ Results: S2p = 12.472 and Sp = 3.532.

examples of effective experiment Design and Data analysis in transportation research 49 Step 4. Compute the t-statistic using the following equation: t x x S n n n n b a p a b a b = − + Result: t = − ( )( ) + = 53 1 50 5 3 532 100 120 100 120 5 43 . . . . . 5. Interpreting the Results: The results of the sample t-test are obtained by comparing the value of the calculated t-statistic (5.43 in this example) with the value of the t-statistic for the level of confidence desired. For a level of confidence of 95%, the t-statistic must be greater than 1.96 to reject the null hypotheses (Ho) that the use of a photo-radar speed-enforcement sys- tem does not change the speed of free-flowing vehicles. (For more information, see NCHRP Project 20-45, Volume 2, Appendix C, Table C-4.) 6. Conclusion and Discussion: The sample problem illustrates the use of a statistical test to determine whether the difference in the value of the variable of interest between the before conditions and the after conditions is statistically significant. The before condition is without photo-radar speed enforcement; and the after condition is with photo-radar speed enforcement. In this sample problem, the computed t-statistic (5.43) is greater than the critical t-statistic (1.96), so the null hypothesis is rejected. This means the change in the speed of free-flowing vehicles when the photo-radar speed-enforcement system is used is statistically significant. The assumption is made that all other factors that would affect the speed of free-flowing vehicles (e.g., traffic mix, weather, or construction activity) are the same in the before-and-after conditions. This test is robust if the normality assumption does not hold completely; however, it should be checked using box plots. For significant departures from normality and variance equality assumptions, non-parametric tests must be conducted. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 6, Section C and also Example 21). The reliability of the results in this example could be improved by using a control group. As the example has been constructed, there is an assumption that the only thing that changed at this site was the use of photo-radar speed enforcement; that is, it is assumed that all observed differences are attributable to the use of the photo-radar. If other factors—even something as simple as a general decrease in vehicle speeds in the area—might have impacted speed changes, the effect of the photo-radar speed enforcement would have to be adjusted for those other factors. Measurements taken at a control site (ideally identical to the experiment site) during the same time periods could be used to detect background changes and then to adjust the photo-radar effects. Such a situation is explored in Example 13. 7. Applications in Other Areas in Transportation: The before-and-after comparison can be used whenever two independent samples of data are (or can be assumed to be) normally distributed with equal variance. Applications of before-and-after comparison in other areas of transportation research may include: • Traffic Operations – to compare the average delay to vehicles approaching a signalized intersection when a fixed time signal is changed to an actuated signal or a traffic-adaptive signal. – to compare the average number of vehicles entering and leaving a driveway when access is changed from full access to right-in, right-out only. 
• Traffic Safety – to compare the average number of crashes on a section of road before and after the road is resurfaced. – to compare the average number of speeding citations issued per day when a stationary operation is changed to a mobile operation. • Maintenance—to compare the average number of citizen complaints per day when a change is made in the snow plowing policy.

50 effective experiment Design and Data analysis in transportation research Example 13: Traffic Safety; Complex Before-and-After Comparisons and Controls Area: Traffic safety Method of Analysis: Complex before-and-after comparisons using control groups (examining the effect of some treatment or application with consideration of other factors that may also have an effect) 1. Research Question/Problem Statement: A state safety engineer wants to estimate the effec- tiveness of fluorescent orange warning signs as compared to standard orange signs in work zones on freeways and other multilane highways. Drivers can see fluorescent signs from a longer distance than standard signs, especially in low-visibility conditions, and the extra cost of the fluorescent material is not too high. Work-zone safety is a perennial concern, especially on freeways and multilane highways where speeds and traffic volumes are high. Question/Issue How can background effects be separated from the effects of a treatment or application? Compared to standard orange signs, do fluorescent orange warning signs increase safety in work zones on freeways and multilane highways? 2. Identification and Description of Variables: The engineer quickly concludes that there is a need to collect and analyze safety surrogate measures (e.g., traffic conflicts and late lane changes) rather than collision data. It would take a long time and require experimentation at many work zones before a large sample of collision data could be ready for analysis on this question. Surrogate measures relate to collisions, but they are much more numerous and it is easier to collect a large sample of them in a short time. For a study of traffic safety, surrogate measures might include near-collisions (traffic conflicts), vehicle speeds, or locations of lane changes. In this example, the engineer chooses to use the location of the lane-change maneuver made by drivers in a lane to be closed entering a work zone. This particular surrogate safety measure is a measure of effectiveness (MOE). The hypothesis is that the farther downstream at which a driver makes a lane change out of a lane to be closed—when the highway is still below capacity—the safer the work zone. 3. Data Collection: The engineer establishes site selection criteria and begins examining all active work zones on freeways and multilane highways in the state for possible inclusion in the study. The site selection criteria include items such as an active work zone, a cooperative contractor, no interchanges within the approach area, and the desired lane geometry. Seven work zones meet the criteria and are included in the study. The engineer decides to use a before-and-after (sometimes designated B/A or b/a) experiment design with randomly selected control sites. The latter are sites in the same population as the treatment sites; that is, they meet the same selection criteria but are untreated (i.e., standard warning signs are employed, not the fluorescent orange signs). This is a strong experiment design because it minimizes three common types of bias in experiments: history, maturation, and regression to the mean. History bias exists when changes (e.g., new laws or large weather events) happen at about the same time as the treatment in an experiment, so that the engineer or analyst cannot separate the effect of the treatment from the effects of the other events. 
Maturation bias exists when gradual changes occur throughout an extended experiment period and cannot be separated from the effects of the treatment. Examples of maturation bias might involve changes like the aging of driver populations or new vehicles with more air bags. History and maturation biases are referred to as specification errors and are described in more detail in NCHRP Project 20-45, Volume 2,

examples of effective experiment Design and Data analysis in transportation research 51 Chapter 1, in the section “Quasi-Experiments.” Regression-to-the-mean bias exists when sites with the highest MOE levels in the before time period are treated. If the MOE level falls in the after period, the analyst can never be sure how much of the fall was due to the treatment and how much was due to natural fluctuations in the values of the MOE back toward its usual mean value. A before-and-after study with randomly selected control sites minimizes these biases because their effects are expected to apply just as much to the treatment sites as to the control sites. In this example, the engineer randomly selects four of the seven work zones to receive fluorescent orange signs. The other three randomly selected work zones received standard orange signs and are the control sites. After the signs have been in place for a few weeks (a common tactic in before-and-after studies to allow regular drivers to get used to the change), the engineer collects data at all seven sites. The location of each vehicle’s lane-change maneuver out of the lane to be closed is measured from video tape recorded for several hours at each site. Table 19 shows the lane-change data at the midpoint between the first warning sign and beginning of the taper. Notice that the same number of vehicles is observed in the before-and- after periods for each type of site. 4. Specification of Analysis Technique and Data Analysis: Depending on their format, data from a before-and-after experiment with control sites may be analyzed several ways. The data in the table lend themselves to analysis with a chi-square test to see whether the distributions between the before-and-after conditions are the same at both the treatment and control sites. (For more information about chi-square testing, see NCHRP Project 20-45, Volume 2, Chapter 6, Section E, “Chi-Square Test for Independence.”) To perform the chi-square test on the data for Example 13, the engineer first computes the expected value in each cell. For the cell corresponding to the before time period for control sites, this value is computed as the row total (3361) times the column total (2738) divided by the grand total (6714): 3361 2738 6714 1371 = vehicles The engineer next computes the chi-square value for each cell using the following equation: χi i i i O E E 2 2 = −( ) where Oi is the number of actual observations in cell i and Ei is the expected number of observations in cell i. For example, the chi-square value in the cell corresponding to the before time period for control sites is (1262 - 1371)2 / 1371 = 8.6. The engineer then sums the chi-square values from all four cells to get 29.1. That sum is then compared to the critical chi-square value for the significance level of 0.025 with 1 degree of freedom (degrees of freedom = number of rows - 1 * number of columns - 1), which is shown on a standard chi-square distribution table to be 5.02 (see NCHRP Project 20-45, Volume 2, Appendix C, Table C-2.) A significance level of 0.025 is not uncommon in such experiments (although 0.05 is a general default value), but it is a standard that is difficult but not impossible to meet. Time Period Number of Vehicles Observed in Lane to be Closed at Midpoint Control Treatment Total Before 1262 2099 3361 After 1476 1877 3353 Total 2738 3976 6714 Table 19. Lane-change data for before-and-after comparison using controls.

52 effective experiment Design and Data analysis in transportation research 5. Interpreting the Results: Because the calculated chi-square value is greater than the critical chi-square value, the engineer concludes that there is a statistically significant difference in the number of vehicles in the lane to be closed at the midpoint between the before-and-after time periods for the treatment sites relative to what would be expected based on the control sites. In other words, there is a difference that is due to the treatment. 6. Conclusion and Discussion: The experiment results show that fluorescent orange signs in work zone approaches like those tested would likely have a safety benefit. Although the engi- neer cannot reasonably estimate the number of collisions that would be avoided by using this treatment, the before-and-after study with control using a safety surrogate measure makes it clear that some collisions will be avoided. The strength of the experiment design with randomly selected control sites means that agencies can have confidence in the results. The consequences of an error in an analysis like this that results in the wrong conclusion can be devastating. If the error leads an agency to use a safety measure more than it should, precious safety funds will be wasted that could be put to better use. If the error leads an agency to use the safety measure less often than it should, money will be spent on measures that do not prevent as many collisions. With safety funds in such short supply, solid analyses that lead to effective decisions on countermeasure deployment are of great importance. A before-and-after experiment with control is difficult to arrange in practice. Such an experiment is practically impossible using collision data, because that would mean leaving some higher collision sites untreated during the experiment. Such experiments are more plausible using surrogate measures like the one described in this example. 7. Applications in Other Areas of Transportation Research: Before-and-after experiments with randomly selected control sites are difficult to arrange in transportation safety and other areas of transportation research. The instinct to apply treatments to the worst sites, rather than randomly—as this method requires—is difficult to overcome. Despite the difficulties, such experiments are sometimes performed in: • Traffic Operations—to test traffic control strategies at a number of different intersections. • Pavement Engineering—to compare new pavement designs and maintenance processes to current designs and practice. • Materials—to compare new materials, mixes, or processes to standard mixtures or processes. Example 14: Work Zones; Trend Analysis Area: Work zones Method of Analysis: Trend analysis (examining, describing, and modeling how something changes over time) 1. Research Question/Problem Statement: Measurements conducted over time often reveal patterns of change called trends. A model may be used to predict some future measurement, or the relative success of a different treatment or policy may be assessed. For example, work/ construction zone safety has been a concern for highway officials, engineers, and planners for many years. Is there a pattern of change? Question/Issue Can a linear model represent change over time? In this particular example, is there a trend over time for motor vehicle crashes in work zones? The problem is to predict values of crash frequency at specific points in time. 
Although the question is simple, the statistical modeling becomes sophisticated very quickly.

examples of effective experiment Design and Data analysis in transportation research 53 2. Identification and Description of Variables: Highway safety, rather the lack of it, is revealed by the total number of fatalities due to motor vehicle crashes. The percentage of those deaths occurring in work zones reveals a pattern over time (Figure 12). The data points for the graph are calculated using the following equation: WZP a b YEAR u= + + where WZP = work zone percentage of total fatalities, YEAR = calendar year, and u = an error term, as used here. 3. Data Collection: The base data are obtained from the Fatality Analysis Reporting System maintained by the National Highway Traffic Safety Administration (NHTSA), as reported at www.workzonesafety.org. The data are state specific as well as for the country as a whole, and cover a period of 26 years from 1982 through 2007. The numbers of fatalities from motor vehicle crashes in and not in construction/maintenance zones (work zones) are used to compute the percentage of fatalities in work zones for each of the 26 years. 4. Specification of Analysis Techniques and Data Analysis: Ordinary least squares (OLS) regression is used to develop the general model specified above. The discussion in this example focuses on the resulting model and the related statistics. (See also examples 15, 16, and 17 for details on calculations. For more information about OLS regression, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, “Linear Regression.”) Looking at the data in Figure 12 another way, WZP = -91.523 (-8.34) (0.000) + 0.047(YEAR) (8.51) (0.000) R = 0.867 t-values p-values R2 = 0.751 The trend is significant: the line (trend) shows an increase of 0.047% each year. Generally, this trend shows that work-zone fatalities are increasing as a percentage of total fatalities. 5. Interpreting the Results: This experiment is a good fit and generally shows that work-zone fatalities were an increasing problem over the period 1982 through 2007. This is a trend that highway officials, engineers, and planners would like to change. The analyst is therefore interested in anticipating the trajectory of the trend. Here the trend suggests that things are getting worse. Figure 12. Percentage of all motor vehicle fatalities occurring in work zones.

54 effective experiment Design and Data analysis in transportation research How far might authorities let things go—5%? 10%? 25%? Caution must be exercised when interpreting a trend beyond the limits of the available data. Technically the slope, or b-coefficient, is the trend of the relationship. The a-term from the regression, also called the intercept, is the value of WZP when the independent variable equals zero. The intercept for the trend in this example would technically indicate that the percentage of motor vehicle fatalities in work zones in the year zero would be -91.5%. This is absurd on many levels. There could be no motor vehicles in year zero, and what is a negative percentage of the total? The absurdity of the intercept in this example reveals that trends are limited concepts, limited to a relevant time frame. Figure 12 also suggests that the trend, while valid for the 26 years in aggregate, doesn’t work very well for the last 5 years, during which the percentages are consistently falling, not rising. Something seems to have changed around 2002; perhaps the highway officials, engineers, and planners took action to change the trend, in which case, the trend reversal would be considered a policy success. Finally, some underlying assumptions must be considered. For example, there is an implicit assumption that the types of roads with construction zones are similar from year to year. If this assumption is not correct (e.g., if a greater number of high speed roads, where fatalities may be more likely, are worked on in some years than in others), then interpreting the trend may not make much sense. 6. Conclusion and Discussion: The computation of this dependent variable (the percent of motor-vehicle fatalities occurring in work zones, or MZP) is influenced by changes in the number of work-zone fatalities and the number of non-work-zone fatalities. To some extent, both of these are random variables. Accordingly, it is difficult to distinguish a trend or trend reversal from a short series of possibly random movements in the same direction. Statistically, more observations permit greater confidence in non-randomness. It is also possible that a data series might be recorded that contains regular, non-random movements that are unrelated to a trend. Consider the dependent variable above (MZP), but measured using monthly data instead of annual data. Further, imagine looking at such data for a state in the upper Midwest instead of for the nation as a whole. In this new situation, the WZP might fall off or halt altogether each winter (when construction and maintenance work are minimized), only to rise again in the spring (reflecting renewed work-zone activity). This change is not a trend per se, nor is it random. Rather, it is cyclical. 7. Applications in Other Areas of Transportation Research: Applications of trend analysis models in other areas of transportation research include: • Transportation Safety—to identify trends in traffic crashes (e.g., motor vehicle/deer) over time on some part of the roadway system (e.g., freeways). • Public Transportation—to determine the trend in rail passenger trips over time (e.g., in response to increasing gas prices). • Pavement Engineering—to monitor the number of miles of pavement that is below some service-life threshold over time. • Environment—to monitor the hours of truck idling time in rest areas over time. Example 15: Structures/Bridges; Trend Analysis Area: Structures/bridges Method of Analysis: Trend analysis (examining a trend over time) 1. 
Research Question/Problem Statement: A state agency wants to monitor trends in the condition of bridge superstructures in order to perform long-term needs assessment for bridge rehabilitation or replacement. Bridge condition rating data will be analyzed for bridge

examples of effective experiment Design and Data analysis in transportation research 55 2. Identification and Description of Variables: Bridge inspection generally entails collection of numerous variables including location information, traffic data, structural elements (type and condition), and functional characteristics. Based on the severity of deterioration and the extent of spread through a bridge component, a condition rating is assigned on a dis- crete scale from 0 (failed) to 9 (excellent). Generally a condition rating of 4 or below indicates deficiency in a structural component. The state agency inspects approximately 300 bridges every year (denominator). The number of superstructures that receive a rating of 4 or below each year (number of events, numerator) also is recorded. The agency is concerned with the change in overall rate (calculated per 100) of structurally deficient bridge superstructures. This rate, which is simply the ratio of the numerator to the denominator, is the indicator (dependent variable) to be examined for trend over a time period of 15 years. Notice that the unit of analysis is the time period and not the individual bridge superstructures. 3. Data Collection: Data are collected for bridges scheduled for inspection each year. It is important to note that the bridge condition rating scale is based on subjective categories, and therefore there may be inherent variability among inspectors in their assignments of rates to bridge superstructures. Also, it is assumed that during the time period for which the trend analysis is conducted, no major changes are introduced in the bridge inspection methods. Sample data provided in Table 20 show the rate (per 100), number of bridges per year that received a score of four or below, and total number of bridges inspected per year. 4. Specification of Analysis Technique and Data Analysis: The data set consists of 15 observa- tions, one for each year. Figure 13 shows a scatter plot of the rate (dependent variable) versus time in years. The scatter plot does not indicate the presence of any outliers. The scatter plot shows a seemingly increasing linear trend in the rate of deficient superstructures over time. No need for data transformation or smoothing is apparent from the examination of the scatter plot in Figure 13. To determine whether the apparent linear trend is statistically significant in this data, ordinary least squares (OLS) regression can be employed. Question/Issue Use collected data to determine if the values that some variables have taken show an increasing trend or a decreasing trend over time. In this example, determine if levels of structural deficiency in bridge superstructures have been increasing or decreasing over time, and determine how rapidly the increase or decrease has occurred. No. Year Rate (per 100) Number of Events (Numerator) Number of Bridges Inspected (Denominator) 1 1990 8.33 25 300 2 1991 8.70 26 299 5 1994 10.54 31 294 11 2000 13.55 42 310 15 2004 14.61 45 308 Table 20. Sample bridge inspection data. superstructures that have been inspected over a period of 15 years. The objective of this study is to examine the overall pattern of change in the indicator variable over time.

56 effective experiment Design and Data analysis in transportation research The linear regression model takes the following form: y x ei o i i= + +β β1 where i = 1, 2, . . . , n (n = 15 in this example), y = dependent variable (rate of structurally deficient bridge superstructures), x = independent variable (time), bo = y-intercept (only provides reference point), b1 = slope (change in unit y for a change in unit x), and ei = residual error. The first step is to estimate the bo and b1 in the regression function. The residual errors (e) are assumed to be independently and identically distributed (i.e., they are mutually independent and have the same probability distribution). b1 and bo can be computed using the following equations: ˆ . ˆ β β 1 1 2 1 0 454= −( ) −( ) −( ) = = = = ∑ ∑ x x y y x x i i i n i i n o y x− =β1 8 396. where y _ is the overall mean of the dependent variable and x _ is the overall mean of the independent variable. The prediction equation for rate of structurally deficient bridge superstructures over time can be written using the following equation: ˆ ˆ ˆ . .y x xo= + = +β β1 8 396 0 454 That is, as time increases by a year, the rate of structurally deficient bridge superstructures increases by 0.454 per 100 bridges. The plot of the regression line is shown in Figure 14. Figure 14 indicates some small variability about the regression line. To conduct hypothesis testing for the regression relationship (Ho: b1 = 0), assessment of this variability and the assumption of normality would be required. (For a discussion on assumptions for residual errors, see NCHRP Project 20-45, Volume 2, Chapter 4.) Like analysis of variance (ANOVA, described in examples 8, 9, and 10), statistical inference is initiated by partitioning the total sum of squares (TSS) into the error sum of squares (SSE) Figure 13. Scatter plot of time versus rate. 7.00 9.00 11.00 13.00 15.00 Time in years Ra te p er 1 00 1 3 5 7 9 11 13 15

examples of effective experiment Design and Data analysis in transportation research 57 and the model sum of squares (SSR). That is, TSS = SSE + SSR. The TSS is defined as the sum of the squares of the difference of each observation from the overall mean. In other words, deviation of observation from overall mean (TSS) = deviation of observation from prediction (SSE) + deviation of prediction from overall mean (SSR). For our example, TSS y y SSR x x i i n i = −( ) = = −( ) = = ∑ 2 1 1 2 2 60 892 57 7 . ˆ .β 90 3 102 1i n SSE TSS SSR = ∑ = − = . Regression analysis computations are usually summarized in a table (see Table 21). The mean squared errors (MSR, MSE) are computed by dividing the sums of squares by corresponding model and error degrees of freedom. For the null hypothesis (Ho: b1 = 0) to be true, the expected value of MSR is equal to the expected value of MSE such that F = MSR/MSE should be a random draw from an F-distribution with 1, n - 2 degrees of freedom. From the regression shown in Table 21, F is computed to be 242.143, and the probability of getting a value larger than the F computed is extremely small. Therefore, the null hypothesis is rejected; that is, the slope is significantly different from zero, and the linearly increasing trend is found to be statistically significant. Notice that a slope of zero implies that knowing a value of the independent variable provides no insight on the value of the dependent variable. 5. Interpreting the Results: The linear regression model does not imply any cause-and-effect relationship between the independent and dependent variables. The y-intercept only provides a reference point, and the relationship need not be linear outside the data range. The 95% confidence interval for b1 is computed as [0.391, 0.517]; that is, the analyst is 95% confident that the true mean increase in the rate of structurally deficient bridge superstructures is between Plot of regression line y = 8.396 + 0.454x R2 = 0.949 7.00 9.00 11.00 13.00 15.00 1 3 5 7 9 11 13 15 Time in years Ra te p er 1 00 Figure 14. Plot of regression line. Source Sum of Squares (SS) Degrees of Freedom (df) Mean Square F Significance Regression 57.790 1 57.790 (MSR) 242.143 8.769e-10 Error 3.102 13 0.239 (MSE) Total 60.892 14 Table 21. Analysis of regression table.

58 effective experiment Design and Data analysis in transportation research 0.391% and 0.517% per year. (For a discussion on computing confidence intervals, see NCHRP Project 20-45, Volume 2, Chapter 4.) The coefficient of determination (R2) provides an indication of the model fit. For this example, R2 is calculated using the following equation: R SSE TSS 2 0 949= = . The R2 indicates that the regression model accounts for 94.9% of the total variation in the (hypothetical) data. It should be noted that such a high value of R2 is almost impossible to attain from analysis of real observational data collected over a long time. Also, distributional assumptions must be checked before proceeding with linear regression, as serious violations may indicate the need for data transformation, use of non-linear regression or non-parametric methods, and so on. 6. Conclusion and Discussion: In this example, simple linear regression has been used to deter- mine the trend in the rate of structurally deficient bridge superstructures in a geographic area. In addition to assessing the overall patterns of change, trend analysis may be performed to: • study the levels of indicators of change (or dependent variables) in different time periods to evaluate the impact of technical advances or policy changes; • compare different geographic areas or different populations with perhaps varying degrees of exposure in absolute and relative terms; and • make projections to monitor progress toward an objective. However, given the dynamic nature of trend data, many of these applications require more sophisticated techniques than simple linear regression. An important aspect of examining trends over time is the accuracy of numerator and denominator data. For example, bridge structures may be examined more than once during the analysis time period, and retrofit measures may be taken at some deficient bridges. Also, the age of structures is not accounted for in this analysis. For the purpose of this example, it is assumed that these (and other similar) effects are negligible and do not confound the data. In real-life application, however, if the analysis time period is very long, it becomes extremely important to account for changes in factors that may have affected the dependent variable(s) and their measurement. An example of the latter could be changes in the volume of heavy trucks using the bridge, changes in maintenance policies, or changes in plowing and salting regimes. 7. Applications in Other Areas of Transportation Research: Trend analysis is carried out in many areas of transportation research, such as: • Transportation Planning/Traffic Operations—to determine the need for capital improve- ments by examining traffic growth over time. • Traffic Safety—to study the trends in overall, fatal, and/or injury crash rates over time in a geographic area. • Pavement Engineering—to assess the long-term performance of pavements under varying loads. • Environment—to monitor the emission levels from commercial traffic over time with growth of industrial areas. Example 16: Transportation Planning; Multiple Regression Analysis Area: Transportation planning Method of Analysis: Multiple regression analysis (testing proposed linear models with more than one independent variable when all variables are continuous)

examples of effective experiment Design and Data analysis in transportation research 59 1. Research Question/Problem Statement: Transportation planners and engineers often work on variations of the classic four-step transportation planning process for estimat- ing travel demand. The first step, trip generation, generally involves developing a model that can be used to predict the number of trips originating or ending in a zone, which is a geographical subdivision of a corridor, city, or region (also referred to as a traffic analysis zone or TAZ). The objective is to develop a statistical relationship (a model) that can be used to explain the variation in a dependent variable based on the variation of one or more independent variables. In this example, ordinary least squares (OLS) regres- sion is used to develop a model between trips generated (the dependent variable) and demographic, socio-economic, and employment variables (independent variables) at the household level. Question/Issue Can a linear relationship (model) be developed between a dependent variable and one or more independent variables? In this application, the dependent variable is the number of trips produced by households. Independent variables include persons, workers, and vehicles in a household, household income, and average age of persons in the household. The basic question is whether the relationship between the dependent (Y) and independent (X) variables can be represented by a linear model using two coefficients (a and b), expressed as follows: Y X= +a b i where a = the intercept and b = the slope of the line. If the relationship being examined involves more than one independent variable, the equa- tion will simply have more terms. In addition, in a more formal presentation, the equation will also include an error term, e, added at the end. 2. Identification and Description of Variables: Data for four-step modeling of travel demand or for calibration of any specific model (e.g., trip generation or trip origins) come from a variety of sources, ranging from the U.S. Census to mail or telephone surveys. The data that are collected will depend, in part, on the specific purpose of the modeling effort. Data appropriate for a trip-generation model typically are collected from some sort of household survey. For the dependent variable in a trip-generation model, data must be collected on trip-making characteristics. These characteristics could include something as simple as the total trips made by a household in a day or involve more complicated break- downs by trip purpose (e.g., work-related trips versus shopping trips) and time of day (e.g., trips made during peak and non-peak hours). The basic issue that must be addressed is to determine the purpose of the proposed model: What is to be estimated or predicted? Weekdays and work trips normally are associated with peak congestion and are often the focus of these models. For the independent variable(s), the analyst must first give some thought to what would be the likely causes for household trips to vary. For example, it makes sense intuitively that household size might be pertinent (i.e., it seems reasonable that more persons in the household would lead to a higher number of household trips). Household members could be divided into workers and non-workers, two variables instead of one. Likewise, other socio-economic characteristics, such as income-related variables, might also make sense as candidate variables for the model. 
Data are collected on a range of candidate variables, and

60 effective experiment Design and Data analysis in transportation research the analysis process is used to sort through these variables to determine which combination leads to the best model. To be used in ordinary regression modeling, variables need to be continuous; that is, measured ratio or interval scale variables. Nominal data may be incorporated through the use of indicator (dummy) variables. (For more information on continuous variables, see NCHRP Project 20-45, Volume 2, Chapter 1; for more information on dummy variables, see NCHRP Project 20-45, Volume 2, Chapter 4). 3. Data Collection: As noted, data for modeling travel demand often come from surveys designed especially for the modeling effort. Data also may be available from centralized sources such as a state DOT or local metropolitan planning organization (MPO). 4. Specification of Analysis Techniques and Data Analysis: In this example, data for 178 house- holds in a small city in the Midwest have been provided by the state DOT. The data are obtained from surveys of about 15,000 households all across the state. This example uses only a tiny portion of the data set (see Table 22). Based on the data, a fairly obvious relationship is initially hypothesized: more persons in a household (PERS) should produce more person- trips (TRIPS). In its simplest form, the regression model has one dependent variable and one independent variable. The underlying assumption is that variation in the independent variable causes the variation in the dependent variable. For example, the dependent variable might be TRIPSi (the count of total trips made on a typical weekday), and the independent variable might be PERS (the total number of persons, or occupants, in the household). Expressing the relation- ship between TRIPS and PERS for the ith household in a sample of households results in the following hypothesized model: TRIPS PERSi i i= + +a b i ε where a and b are coefficients to be determined by ordinary least squares (OLS) regression analysis and ei is the error term. The difference between the value of TRIPS for any household predicted using the devel- oped equation and the actual observed value of TRIPS for that same household is called the residual. The resulting model is an equation for the best fit straight line (for the given data) where a is the intercept and b is the slope of the line. (For more information about fitted regression and measures of fit see NCHRP Project 20-45, Volume 2, Chapter 4). In Table 22, R is the multiple R, the correlation coefficient in the case of the simplest linear regression involving one variable (also called univariate regression). The R2 (coefficient of determination) may be interpreted as the proportion of the variance of the dependent variable explained by the fitted regression model. The adjusted R2 corrects for the number of independent variables in the equation. A “perfect” R2 of 1.0 could be obtained if one included enough independent variables (e.g., one for each observation), but doing so would hardly be useful. Coefficients t-values (statistics) p-values Measures of Fit a = 3.347 4.626 0.000 R = 0.510 b = 2.001 7.515 0.000 R2 = 0.260 Adjusted R2 = 0.255 Table 22. Regression model statistics.

examples of effective experiment Design and Data analysis in transportation research 61 Restating the now-calibrated model, TRIPS PERS= +4 626 7 515. . i The statistical significance of each coefficient estimate is evaluated with the p-values of calculated t-statistics, provided the errors are normally distributed. The p-values (also known as probability values) generally indicate whether the coefficients are significantly different from zero (which they need to be in order for the model to be useful). More formally stated, a p-value is the probability of a Type I error. In this example, the t- and p-values shown in Table 22 indicate that both a and b are sig- nificantly different from zero at a level of significance greater than the 99.9% confidence level. P-values are generally offered as two-tail (two-sided hypothesis testing) test values in results from most computer packages; one-tail (one-sided) values may sometimes be obtained by dividing the printed p-values by two. (For more information about one-sided versus two- sided hypothesis testing, see NCHRP Project 20-45, Volume 2, Chapter 4.) The R2 may be tested with an F-statistic; in this example, the F was calculated as 56.469 (degrees of freedom = 2, 176) (See NCHRP Project 20-45, Volume 2, Chapter 4). This means that the model explains a significant amount of the variation in the dependent variable. A plot of the estimated model (line) and the actual data are shown in Figure 15. A strict interpretation of this model suggests that a household with zero occupants (PERS = 0) will produce 3.347 trips per day. Clearly, this is not feasible because there can’t be a household of zero persons, which illustrates the kind of problem encountered when a model is extrapolated beyond the range of the data used for the calibration. In other words, a formal test of the intercept (the a) is not always meaningful or appropriate. Extension of the Model to Multivariate Regression: When the list of potential inde- pendent variables is considered, the researcher or analyst might determine that more than one cause for variation in the dependent variable may exist. In the current example, the question of whether there is more than one cause for variation in the number of trips can be considered. 0 1 2 3 4 5 6 7 8 9 10 PERS 0 10 20 30 40 TR IP S Figure 15. Plot of the line for the estimated model.

The model just discussed for evaluating the effect of one independent variable is called a univariate model. Should the final model for this example be multivariate? Before determining the final model, the analyst may want to consider whether a variable or variables exist that further clarify what has already been modeled (e.g., more persons cause more trips). The variable PERS is a crude measure, made up of workers and non-workers. Most households have one or two workers. It can be shown that a measure of the non-workers in the household is more effective in explaining trips than is total persons; so a new variable, persons minus workers (DEP), is calculated.

Next, variables may exist that address entirely different causal relationships. It might be hypothesized that as the number of registered motor vehicles available in the household (VEH) increases, the number of trips will increase. It may also be argued that as household income (INC, measured in thousands of dollars) increases, the number of trips will increase. Finally, it may be argued that as the average age of household occupants (AVEAGE) increases, the number of trips will decrease because retired people generally make fewer trips. Each of these statements is based upon a logical argument (hypothesis). Given these arguments, the hypothesized multivariate model takes the following form:

TRIPS_i = a + b·DEP_i + c·VEH_i + d·INC_i + e·AVEAGE_i + ε_i

The results from fitting the multivariate model are given in Table 23. Results of the analysis of variance (ANOVA) for the overall model are shown in Table 24.

Table 23. Results from fitting the multivariate model.
Coefficients      t-values (statistics)   p-values    Measures of Fit
a = 8.564         6.274                   3.57E-09*   R = 0.589
b = 0.899         2.832                   0.005       R2 = 0.347
c = 1.067         3.360                   0.001       Adjusted R2 = 0.330
d = 1.907E-05*    1.927                   0.056
e = -0.098        -4.808                  3.68E-06
*See note about scientific notation in Section 5, Interpreting the Results.

Table 24. ANOVA results for the overall model.
ANOVA        Sum of Squares (SS)   Degrees of Freedom (df)   F-ratio   p-value
Regression   1487.5                4                         19.952    3.4E-13
Residual     2795.7                150

5. Interpreting the Results: It is common for regression packages to provide some values in scientific notation, as shown for the p-values in Table 23. The coefficient d, showing the relationship of TRIPS with INC, is read 1.907E-05, which in turn is read as 1.907 × 10^-5, or 0.00001907. All coefficients are of the expected sign and significantly different from 0 (at the 0.05 level) except for d. However, testing the intercept makes little sense. (The intercept value would be the number of trips for a household with 0 vehicles, 0 income, 0 average age, and 0 dependents, a most unlikely household.) The overall model is significant as shown by the F-ratio and its p-value, meaning that the model explains a significant amount of the variation in the dependent variable.
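The multivariate fit can be sketched the same way. Again the data are synthetic stand-ins (the variable names follow the text; the coefficients used to generate the fake data are invented for illustration), and the formula interface is just one convenient way to specify the model:

```python
# A sketch of the multivariate model TRIPS = a + b*DEP + c*VEH + d*INC + e*AVEAGE,
# fit to synthetic stand-in data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 155
df = pd.DataFrame({
    "DEP": rng.integers(0, 5, n),        # persons minus workers
    "VEH": rng.integers(0, 4, n),        # registered vehicles available
    "INC": rng.normal(45, 15, n),        # income, thousands of dollars
    "AVEAGE": rng.uniform(20, 70, n),    # average age of occupants
})
df["TRIPS"] = (8.5 + 0.9 * df["DEP"] + 1.1 * df["VEH"]
               + 0.02 * df["INC"] - 0.1 * df["AVEAGE"]
               + rng.normal(0, 3, n))

fit = smf.ols("TRIPS ~ DEP + VEH + INC + AVEAGE", data=df).fit()
print(fit.summary())  # coefficient table (like Table 23) and overall F-test (Table 24)
```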

This model should reliably explain 33% of the variance of household trip generation. Caution should be exercised when interpreting the significance of the R2 and the overall model because it is not uncommon to have a significant F-statistic when some of the coefficients in the equation are not significant. The analyst may want to consider recalibrating the model without the income variable because the coefficient d was insignificant.

6. Conclusion and Discussion: Regression, particularly OLS regression, relies on several assumptions about the data, the nature of the relationships, and the results. Data are assumed to be interval or ratio scale. Independent variables generally are assumed to be measured without error, so all error is attributed to the model fit. Furthermore, independent variables should be independent of one another. This is a serious concern because the presence in the model of related independent variables, called multicollinearity, compromises the t-tests and confuses the interpretation of coefficients. Tests of this problem are available in most statistical software packages that include regression. Look for Variance Inflation Factor (VIF) and/or Tolerance tests; most packages will have one or the other, and some will have both.

In the example above, where PERS is divided into DEP and workers, knowing any two variables allows the calculation of the third. Including all three variables in the model would be a case of extreme multicollinearity and, logically, would make no sense. In this instance, because one variable is a linear combination of the other two, the calculations required (within the analysis program) to calibrate the model would actually fail. If the independent variables are simply highly correlated, the regression coefficients (at a minimum) may not have intuitive meaning. In general, equations or models with highly correlated independent variables are to be avoided; alternative models that examine one variable or the other, but not both, should be analyzed.

It is also important to analyze the error distributions. Several assumptions relate to the errors and their distributions (normality, constant variance, uncorrelated, etc.). In transportation planning, spatial variables and associations might become important; they require more elaborate constructs and often different estimation processes (e.g., Bayesian, Maximum Likelihood). (For more information about errors and error distributions, see NCHRP Project 20-45, Volume 2, Chapter 4.)

Other logical considerations also exist. For example, for the measurement units of the different variables, does the magnitude of the result of multiplying the coefficient and the measured variable make sense and/or have a reasonable effect on the predicted magnitude of the dependent variable? Perhaps more importantly, do the independent variables make sense? In this example, does it make sense that changes in the number of vehicles in the household would cause an increase or decrease in the number of trips? These are measures of operational significance that go beyond consideration of statistical significance, but are no less important.
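As a minimal illustration of the multicollinearity screening described above, variance inflation factors can be computed for the predictors of the previous sketch (the data frame `df` is carried over from that sketch). Thresholds such as 5 or 10 are common rules of thumb, not fixed standards:

```python
# Screening the independent variables for multicollinearity with VIF.
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# df is the synthetic data frame from the previous sketch
X = sm.add_constant(df[["DEP", "VEH", "INC", "AVEAGE"]])
for i, name in enumerate(X.columns):
    if name != "const":                  # skip the intercept column
        print(name, variance_inflation_factor(X.values, i))
```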
7. Applications in Other Areas of Transportation Research: Regression is a very important technique across many areas of transportation research, including:
• Transportation Planning
– to include the other half of trip generation, e.g., predicting trip destinations as a function of employment levels by various types (factory, commercial), square footage of shopping center space, and so forth.
– to investigate the trip distribution stage of the 4-step model (log transformation of the gravity model).
• Public Transportation—to predict loss/liability on subsidized freight rail lines (function of segment ton-miles, maintenance budgets and/or standards, operating speeds, etc.) for self-insurance computations.
• Pavement Engineering—to model pavement deterioration (or performance) as a function of easily monitored predictor variables.

Example 17: Traffic Operations; Regression Analysis

Area: Traffic operations

Method of Analysis: Regression analysis (developing a model to predict the values that some variable can take as a function of one or more other variables, when not all variables are assumed to be continuous)

1. Research Question/Problem Statement: An engineer is concerned about false capacity at intersections being designed in a specified district. False capacity occurs where a lane is dropped just beyond a signalized intersection. Drivers approaching the intersection and knowing that the lane is going to be dropped shortly afterward avoid the lane. However, engineers estimating the capacity and level of service of the intersection during design have no reliable way to estimate the percentage of traffic that will avoid the lane (the lane distribution).

Question/Issue: Develop a model that can be used to predict the values that a dependent variable can take as a function of changes in the values of the independent variables. In this particular instance, how can engineers make a good estimate of the lane distribution of traffic volume in the case of a lane drop just beyond an intersection? Can a linear model be developed that can be used to predict this distribution based on other variables?

The basic question is whether a linear relationship exists between the dependent variable (Y; in this case, the lane distribution percentage) and some independent variable(s) (X). The relationship can be expressed using the following equation:

Y = a + b·X

where a is the intercept and b is the slope of the line (see NCHRP Project 20-45, Volume 2, Chapter 4, Section B).

2. Identification and Description of Variables: The dependent variable of interest in this example is the volume of traffic in each lane on the approach to a signalized intersection with a lane drop just beyond. The traffic volumes by lane are converted into lane utilization factors (f_LU), to be consistent with standard highway capacity techniques. The Highway Capacity Manual defines f_LU using the following equation:

f_LU = v_g / (v_g1 · N)

where v_g is the flow rate in a lane group in vehicles per hour, v_g1 is the flow rate in the lane with the highest flow rate of any in the group in vehicles per hour, and N is the number of lanes in the lane group.

The engineer thinks that lane utilization might be explained by one or more of 15 different factors, including the type of lane drop, the distance from the intersection to the lane drop, the taper length, and the heavy vehicle percentage. All of the variables are continuous except the type of lane drop. The type of lane drop is used to categorize the sites.

3. Data Collection: The engineer locates 46 lane-drop sites in the area and collects data at these sites by means of video recording. The engineer tapes for up to 3 hours at each site. The data are summarized in 15-minute periods, again to be consistent with standard highway capacity practice. For one type of lane-drop geometry, with two through lanes and an exclusive right-turn lane on the approach to the signalized intersection, the engineer ends up with 88 valid

data points (some sites have provided more than one data point), covering 15 minutes each, to use in equation (model) development.

4. Specification of Analysis Technique and Data Analysis: Multiple (or multivariate) regression is a standard statistical technique to develop predictive equations. (More information on this topic is given in NCHRP Project 20-45, Volume 2, Chapter 4, Section B.) The engineer performs five steps to develop the predictive equation.

Step 1. The engineer examines plots of each of the 15 candidate variables versus f_LU to see if there is a relationship and to see what forms the relationships might take.

Step 2. The engineer screens all 15 candidate variables for multicollinearity. (Multicollinearity occurs when two variables are related to each other and essentially contribute the same information to the prediction.) Multicollinearity can lead to models with poor predicting power and other problems. The engineer examines the variables for multicollinearity by
• looking at plots of each of the 15 candidate variables against every other candidate variable;
• calculating the correlation coefficient for each of the 15 candidate independent variables against every other candidate variable; and
• using more sophisticated tests (such as the variance inflation factor) that are available in statistical software.

Step 3. The engineer reduces the set of candidate variables to eight. Next, the engineer uses statistical software to select variables and estimate the coefficients for each selected variable, assuming that the regression equation has a linear form. To select variables, the engineer employs forward selection (adding variables one at a time until the equation fit ceases to improve significantly) and backward elimination (starting with all candidate variables in the equation and removing them one by one until the equation fit starts to deteriorate). The equation fit is measured by R2 (for more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, under the heading "Descriptive Measures of Association Between X and Y"), which shows how well the equation fits the data on a scale from 0 to 1, and by other factors provided by statistical software. In this case, forward selection and backward elimination result in an equation with five variables:
• Drop: Lane drop type, a 0 or 1 depending on the type;
• Left: Left turn status, a 0 or 1 depending on the types of left turns allowed;
• Length: The distance from the intersection to the lane drop, in feet ÷ 1,000;
• Volume: The average lane volume, in vehicles per hour per lane ÷ 1,000; and
• Sign: The number of signs warning of the lane drop.

Notice that the first two variables are discrete variables and had to assume a zero-or-one format to work within the regression model. Each of the five variables has a coefficient that is significantly different from zero at the 95% confidence level, as measured by a t-test. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, "How Are t-statistics Interpreted?")

Step 4. Once an initial model has been developed, the engineer plots the residuals for the tentative equation to see whether the assumed linear form is correct. A residual is the difference, for each observation, between the prediction the equation makes for f_LU and the actual value of f_LU.
In this example, a plot of the predicted value versus the residual for each of the 88 data points shows a fan-like shape, which indicates that the linear form is not appropriate. (NCHRP Project 20-45, Volume 2, Chapter 4, Section B, Figure 6 provides examples of residual plots that are and are not desirable.) The engineer experiments with several other model forms, including non-linear equations that involve transformations of variables, before settling on a lognormal form that provides a good R2 value of 0.73 and a desirable shape for the residual plot.
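A hedged sketch of the diagnostic logic in Steps 4 and 5 follows: fit a linear form, inspect the residuals for the fan shape described above, and refit a log-transformed (lognormal) form. Everything here is illustrative; the data are synthetic, only two of the five final predictors are included, and the generating coefficients are invented:

```python
# Residual diagnostics and a log-transform refit, on synthetic lane-drop data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 88
length = rng.uniform(0.1, 2.0, n)    # distance to lane drop, ft / 1,000
volume = rng.uniform(0.2, 1.5, n)    # average lane volume, vphpl / 1,000
flu = np.exp(0.5 + 0.2 * length + 0.6 * volume + rng.normal(0, 0.1, n))

X = sm.add_constant(np.column_stack([length, volume]))

linear = sm.OLS(flu, X).fit()
# Plotting linear.fittedvalues against linear.resid would show the fan
# shape that signals a non-constant error variance for this linear form.

loglinear = sm.OLS(np.log(flu), X).fit()    # lognormal form: ln(f_LU) linear in X
print(linear.rsquared, loglinear.rsquared)  # compare fit before and after transform
```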

Step 5. Finally, the engineer examines the candidate equation for logic and practicality, asking whether the variables make sense, whether the signs of the variables make sense, and whether the variables can be collected easily by design engineers. Satisfied that the answers to these questions are "yes," the final equation (model) can be expressed as follows:

f_LU = exp(0.539 − 0.218·Drop − 0.148·Left + 0.178·Length + 0.627·Volume − 0.105·Sign)

5. Interpreting the Results: The process described in this example results in a useful equation for estimating the lane utilization in a lane to be dropped, thereby avoiding the estimation of false capacity. The equation has five terms and is non-linear, which will make its use a bit challenging. However, the database is large, the equation fits the data well, and the equation is logical, which should boost the confidence of potential users. If potential users apply the equation within the ranges of the data used for the calibration, the equation should provide good predictions. Applying any model outside the range of the data on which it was calibrated increases the likelihood of an inaccurate prediction.

6. Conclusion and Discussion: Regression is a powerful statistical technique that provides models engineers can use to make predictions in the absence of direct observation. Engineers tempted to use regression techniques should notice from this and other examples that the effort is substantial. Engineers using regression techniques should not skip any of the steps described above, as doing so may result in equations that provide poor predictions to users.

Analysts considering developing a regression model to help make needed predictions should not be intimidated by the process. Although there are many pitfalls in developing a regression model, analysts considering making the effort should also consider the alternative: how the prediction will be made in the absence of a model. In the absence of a model, predictions of important factors like lane utilization would be made using tradition, opinion, or simple heuristics. With guidance from NCHRP Project 20-45 and other texts, and with good software available to make the calculations, credible regression models often can be developed that perform better than the traditional prediction methods.

Because regression models developed by transportation engineers are often reused in later studies by others, the stakes are high. The consequences of a model that makes poor predictions can be severe in terms of suboptimal decisions. Lane utilization models often are employed in traffic studies conducted to analyze new development proposals. A model that under-predicts utilization in a lane to be dropped may mean that the development is turned down due to the anticipated traffic impacts or that the developer has to pay for additional and unnecessary traffic mitigation measures. On the other hand, a model that over-predicts utilization in a lane to be dropped may mean that the development is approved with insufficient traffic mitigation measures in place, resulting in traffic delays, collisions, and the need for later intervention by a public agency.

7. Applications in Other Areas of Transportation Research: Regression is used in almost all areas of transportation research, including:
• Transportation Planning—to create equations to predict trip generation and mode split.
• Traffic Safety—to create equations to predict the number of collisions expected on a particular section of road.
• Pavement Engineering/Materials—to predict long-term wear and condition of pavements.

Example 18: Transportation Planning; Logit and Related Analysis

Area: Transportation planning

Method of Analysis: Logit and related analysis (developing predictive models when the dependent variable is dichotomous—e.g., 0 or 1)

1. Research Question/Problem Statement: Transportation planners often utilize variations of the classic four-step transportation planning process for predicting travel demand. Trip generation, trip distribution, mode split, and trip assignment are used to predict traffic flows under a variety of forecasted changes in networks, population, land use, and controls. Mode split, deciding which mode of transportation a traveler will take, requires predicting mutually exclusive outcomes. For example, will a traveler utilize public transit or drive his or her own car?

Question/Issue: Can a linear model be developed that can be used to predict the probability that one of two choices will be made? In this example, the question is whether a household will use public transit (or not). Rather than being continuous (as in linear regression), the dependent variable is reduced to two categories, a dichotomous variable (e.g., yes or no, 0 or 1). Although the question is simple, the statistical modeling becomes sophisticated very quickly.

2. Identification and Description of Variables: Considering a typical, traditional urban area in the United States, it is reasonable to argue that the likelihood of taking public transit to work (Y) will be a function of income (X). Generally, more income means less likelihood of taking public transit. This can be modeled using the following equation:

Y_i = β1 + β2·X_i + u_i

where X_i = family income, Y = 1 if the family uses public transit, and Y = 0 if the family does not use public transit.

3. Data Collection: These data normally are obtained from travel surveys conducted at the local level (e.g., by a metropolitan area or specific city), although the agency that collects the data often is a state DOT.

4. Specification of Analysis Techniques and Data Analysis: In this example the dependent variable is dichotomous and is a linear function of an explanatory variable. Consider the equation E(Y_i|X_i) = β1 + β2·X_i. Notice that if P_i = probability that Y = 1 (household utilizes transit), then (1 − P_i) = probability that Y = 0 (household doesn't utilize transit). This has been called a linear probability model. Note that within this expression, "i" refers to a household. Thus, Y has the distribution shown in Table 25.

Table 25. Distribution of Y.
Values that Y Takes   Probability   Meaning/Interpretation
1                     P_i           Household uses transit
0                     1 − P_i       Household does not use transit
Total                 1.0

Any attempt to estimate this relationship with standard (OLS) regression is saddled with many problems (e.g., non-normality of errors, heteroscedasticity, and the possibility that the predicted Y will be outside the range 0 to 1, to say nothing of pretty terrible R2 values).

An alternative formulation for estimating P_i, the cumulative logistic distribution, is expressed by the following equation:

P_i = 1 / (1 + e^−(β1 + β2·X_i))

This function can be plotted as a lazy Z-curve where on the left, with low values of X (low household income), the probability starts near 1 and ends at 0 (Figure 16). Notice that, even at 0 income, not all households use transit. The curve is said to be asymptotic to 1 and 0. The value of P_i varies between 1 and 0 in relation to income, X.

[Figure 16. Plot of the cumulative logistic distribution showing a lazy Z-curve.]

Manipulating the definition of the cumulative logistic distribution from above:

P_i · (1 + e^−(β1 + β2·X_i)) = 1
P_i · e^−(β1 + β2·X_i) = 1 − P_i
e^−(β1 + β2·X_i) = (1 − P_i) / P_i

and

e^(β1 + β2·X_i) = P_i / (1 − P_i)

The final expression is the ratio of the probability of utilizing public transit divided by the probability of not utilizing public transit. It is called the odds ratio. Next, taking the natural log of both sides (and reversing) results in the following equation:

L_i = ln(P_i / (1 − P_i)) = β1 + β2·X_i

L is called the logit, and this is called a logit model. The left side is the natural log of the odds ratio.
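The algebra above is easy to verify numerically. In the sketch below the coefficient values are placeholders chosen for illustration (not estimates from the example's data), and the final check confirms that the logit transform recovers the linear form exactly:

```python
# Evaluating the cumulative logistic function and the logit transform.
import numpy as np

beta1, beta2 = 1.0, -4e-5           # placeholder coefficients for illustration
income = np.linspace(0, 80000, 9)   # household income, dollars

p = 1.0 / (1.0 + np.exp(-(beta1 + beta2 * income)))  # the lazy Z-curve
logit = np.log(p / (1.0 - p))                        # ln of the odds ratio

print(np.round(p, 3))                                # starts near 0.73, falls toward 0
print(np.allclose(logit, beta1 + beta2 * income))    # True: the transform is exact
```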

Unfortunately, this odds ratio is meaningless for individual households where the probability is either 0 or 1 (utilize or not utilize). If the analyst uses standard OLS regression on this equation, with data for individual households, there is a problem because when P_i happens to equal either 0 or 1 (which is all the time!), the odds ratio will, as a result, equal either 0 or infinity (and the logarithm will be undefined) for all observations. However, by using groups of households the problem can be mitigated.

Table 26 presents data based on a survey of 701 households, more than half of which (380) use transit. The income data are recorded for intervals; here, interval mid-points (X_j) are shown. The number of households in each income category is tallied (N_j), as is the number of households in each income category that utilizes public transit (n_j). It is important to note that while there are more than 700 households (i), the number of observations (categories, j) is only 13. Using these data, for each income bracket, the probability of taking transit can be estimated as follows:

P̂_j = n_j / N_j

This equation is an expression of relative frequency (i.e., it expresses the proportion in income bracket j using transit).

Table 26. Data examined by groups of households.
X_j ($)   N_j (Households)   n_j (Utilizing Transit)   P_j (Defined Above)
$6,000    40                 30                        0.750
$8,000    55                 39                        0.709
$10,000   65                 43                        0.662
$13,000   88                 58                        0.659
$15,000   118                69                        0.585
$20,000   81                 44                        0.543
$25,000   70                 33                        0.471
$30,000   62                 25                        0.403
$35,000   40                 16                        0.400
$40,000   30                 11                        0.367
$50,000   22                 6                         0.273
$60,000   18                 4                         0.222
$75,000   12                 2                         0.167
Total:    701                380

An examination of Table 26 shows clearly that there is a progression of these relative frequencies, with higher income brackets showing lower relative frequencies, just as was hypothesized. We can calculate the odds ratio for each income bracket listed in Table 26 and estimate the following logit function with OLS regression:

L_j = ln[(n_j/N_j) / (1 − n_j/N_j)] = β1 + β2·X_j

The results of this regression are shown in Table 27.

Table 27. Results of OLS regression.
Coefficients       t-values (statistics)   p-values   Measures of Fit
β1 = 1.037         12.156                  0.000      R = 0.980
β2 = -0.00003863   -16.407                 0.000      R2 = 0.961; adjusted R2 = 0.957

The results also can be expressed as an equation:

Log odds ratio = 1.037 − 0.00003863·X

5. Interpreting the Results: This model provides a very good fit. The estimates of the coefficients can be inserted in the original cumulative logistic function to directly estimate the probability of using transit for any given X (income level). Indeed, the logistic graph in Figure 16 is produced with the estimated function.
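The grouped-data estimation can be reproduced directly from Table 26. A minimal sketch follows (statsmodels is used for the OLS step, but any regression routine would do):

```python
# Grouped-data logit estimation: compute Pj = nj/Nj for each income bracket,
# take the log odds, and fit by OLS. Data are from Table 26.
import numpy as np
import statsmodels.api as sm

x = np.array([6, 8, 10, 13, 15, 20, 25, 30, 35, 40, 50, 60, 75]) * 1000.0
N = np.array([40, 55, 65, 88, 118, 81, 70, 62, 40, 30, 22, 18, 12])
n = np.array([30, 39, 43, 58, 69, 44, 33, 25, 16, 11, 6, 4, 2])

p = n / N                      # relative frequency of transit use per bracket
L = np.log(p / (1 - p))        # log odds ratio for each bracket

fit = sm.OLS(L, sm.add_constant(x)).fit()
print(fit.params)              # about 1.037 and -0.00003863, as in Table 27
```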

6. Conclusion and Discussion: This approach to estimation is not without further problems. For example, the N within each income bracket needs to be sufficiently large that the relative frequency (and therefore the resulting odds ratio) is accurately estimated. Many statisticians would say that a minimum of 25 is reasonable.

This approach also is limited by the fact that only one independent variable is used (income). Common sense suggests that the right-hand side of the function could logically be expanded to include more than one predictor variable (more Xs). For example, it could be argued that educational level might act, along with income, to account for the probability of using transit. However, combining predictor variables severely impinges on the categories (the j) used in this OLS regression formulation. To illustrate, assume that five educational categories are used in addition to the 13 income brackets (e.g., Grade 8 or less, Grade 9 to high school graduate, some college, BA or BS degree, and graduate degree). For such an OLS regression analysis to work, data would be needed for 5 × 13, or 65, categories.

Ideally, other travel modes should also be considered. In the example developed here, only transit and not-transit are considered. In some locations it is entirely reasonable to examine private auto versus bus versus bicycle versus subway versus light rail (involving five modes, not just two). This notion of a polychotomous logistic regression is possible. However, five modes cannot be estimated with the OLS regression technique employed above. The logit above is a variant of the binomial distribution, and the polychotomous logistic model is a variant of the multinomial distribution (see NCHRP Project 20-45, Volume 2, Chapter 5). Estimation of these more advanced models requires maximum likelihood methods (as described in NCHRP Project 20-45, Volume 2, Chapter 5).

Other model variants are based upon other cumulative probability distributions. For example, there is the probit model, in which the normal cumulative density function is used. The probit model is very similar to the logit model, but it is more difficult to estimate.

7. Applications in Other Areas of Transportation Research: Applications of logit and related models abound within transportation studies. In any situation in which human behavior is relegated to discrete choices, this category of models may be applied. Examples in other areas of transportation research include:
• Transportation Planning—to model any "choice" issue, such as shopping destination choices.
• Traffic Safety—to model dichotomous responses (e.g., did a motorist slow down or not) in response to traffic control devices.
• Highway Design—to model public reactions to proposed design solutions (e.g., support or not support proposed road diets, installation of roundabouts, or use of traffic calming techniques).

Example 19: Public Transit; Survey Design and Analysis

Area: Public transit

Method of Analysis: Survey design and analysis (organizing survey data for statistical analysis)

1. Research Question/Problem Statement: The transit director is considering changes to the fare structure and the service characteristics of the transit system. To assist in determining which changes would be most effective or efficient, a survey of the current transit riders is developed.

Question/Issue: Use and analysis of data collected in a survey. Results from a survey of transit users are used to estimate the change in ridership that would result from a change in the service or fare.

2. Identification and Description of Variables: Two types of variables are needed for this analysis. The first is data on the characteristics of the riders, such as gender, age, and access to an automobile. These data are discrete variables. The second is data on the riders' stated responses to proposed changes in the fare or service characteristics. These data also are treated as discrete variables. Although some, like the fare, could theoretically be continuous, they are normally expressed in discrete increments (e.g., $1.00, $1.25, $1.50).

3. Data Collection: These data are normally collected by agencies conducting a survey of the transit users. The initial step in the experiment design is to choose the variables to be collected for each of these two data sets. The second step is to determine how to categorize the data. Both steps are generally based on past experience and common sense.

Some of the variables used to describe the characteristics of the transit user are dichotomous, such as gender (male or female) and access to an automobile (yes or no). Other variables, such as age, are grouped into discrete categories within which the transit riding characteristics are similar. For example, one would not expect there to be a difference between the transit trip needs of a 14-year-old student and a 15-year-old student. Thus, the survey responses of these two age groups would be assigned to the same age category. However, experience (and common sense) leads one to differentiate a 19-year-old transit user from a 65-year-old transit user, because their purposes for taking trips and their perspectives on the relative value of the fare and the service components are both likely to be different.

Obtaining user responses to changes in the fare or service is generally done in one of two ways. The first is to make a statement and ask the responder to mark one of several choices: strongly agree, agree, neither agree nor disagree, disagree, and strongly disagree. The number of statements used in the survey depends on how many parameter changes are being contemplated. Typical statements include:
1. I would increase the number of trips I make each month if the fare were reduced by $0.xx.
2. I would increase the number of trips I make each month if I could purchase a monthly pass.
3. I would increase the number of trips I make each month if the waiting time at the stop were reduced by 10 minutes.
4. I would increase the number of trips I make each month if express services were available from my origin to my destination.

The second format is to propose a change and provide multiple choices for the responder. Typical questions for this format are:
1. If the fare were increased by $0.xx per trip I would:
a) not change the number of trips per month
b) reduce the non-commute trips
c) reduce both the commute and non-commute trips
d) switch modes
2. If express service were offered for an additional $0.xx per trip I would:
a) not change the number of trips per month on this local service
b) make additional trips each month
c) shift from the local service to the express service

These surveys generally are administered by handing a survey form to people as they enter the transit vehicle and collecting the forms as people depart the vehicle. The surveys also can be administered by mail, telephone, or in a face-to-face interview. In constructing the questions, care should be taken to use terms with which the respondents will be familiar. For example, if the system does not currently offer "express" service, this term will need to be defined in the survey. Other technical terms should be avoided. Similarly, the word "mode" is often used by transportation professionals but is not commonly used by the public at large. The length of a survey is almost always an issue as well. To avoid asking too many questions, each question needs to be reviewed to see if it is really necessary and will produce useful data (as opposed to just being something that would be nice to know).

4. Specification of Analysis Technique and Data Analysis: The results of these surveys often are displayed in tables or in frequency distribution diagrams (see also Example 1 and Example 2). Table 28 lists responses to a sample question posed in the form of a statement; Figure 17 shows the frequency diagram for these data.

Table 28. Responses to the sample statement, "I would increase the number of trips I make each month if the fare were reduced by $0.xx."
                  Strongly Agree   Agree   Neither Agree nor Disagree   Disagree   Strongly Disagree
Total responses   450              600     300                          400        100

[Figure 17. Frequency diagram for total responses to the sample statement.]

Similar presentations can be made for any of the groupings included in the first type of variables discussed above. For example, if gender is included as a Type 1 question, the results might appear as shown in Table 29; Figure 18 shows the frequency diagram for these data. Presentations of the data can be made for any combination of the discrete variable groups included in the survey.
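A small sketch of the tabulation itself, using the Table 28 counts; pandas is one convenient choice (an assumption, not a requirement of the method) for building the frequency table and the response shares used later in the example:

```python
# Tallying the Table 28 responses and computing shares.
import pandas as pd

responses = pd.Series(
    [450, 600, 300, 400, 100],
    index=["Strongly agree", "Agree", "Neither agree nor disagree",
           "Disagree", "Strongly disagree"],
)
print(responses / responses.sum())   # shares of the 1,850 total responses
# responses.plot.bar() would reproduce the frequency diagram of Figure 17
```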

For example, to display responses of female users over 65 years old, all of the survey forms on which these two characteristics (female and over 65 years old) are checked could be extracted and recorded in a table and shown in a frequency diagram.

Table 29. Contingency table showing responses by gender to the sample statement, "I would increase the number of trips I make each month if the fare were reduced by $0.xx."
                  Strongly Agree   Agree   Neither Agree nor Disagree   Disagree   Strongly Disagree
Male              200              275     200                          200        70
Female            250              325     100                          200        30
Total responses   450              600     300                          400        100

[Figure 18. Frequency diagram showing responses by gender to the sample statement.]

5. Interpreting the Results: Survey data can be used to compare the responses to fare or service changes of different groups of transit users. This flexibility can be important in determining which changes would impact various segments of transit users. The information can be used to evaluate various fare and service options being considered and allows the transit agency to design promotions to obtain the greatest increase in ridership. For example, by creating frequency diagrams to display the responses to statements 2, 3, and 4 listed in Section 3, the engineer can compare the impact of changing the fare versus changing the headway or providing express services in the corridor.

Organizing response data according to different characteristics of the user produces contingency tables like the one illustrated for males and females (Table 29). This table format can be used to conduct chi-square analysis to determine if there is any statistically significant difference among the various groups. (Chi-square analysis is described in more detail in Example 4.)

6. Conclusions and Discussion: This example illustrates how to obtain and present quantitative information using surveys. Although survey results provide reasonably good estimates of the relative importance users place on different transit attributes (fare, waiting time, hours of service, etc.) when determining how often they would use the system, the magnitude of users' responses often is overstated. Experience shows that what users say they would do (their stated preference) generally is different than what they actually do (their revealed preference).
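As a sketch of the chi-square analysis mentioned above, the Table 29 contingency table can be tested for independence between gender and response; scipy's chi2_contingency handles the expected-count computation:

```python
# Chi-square test of independence on the Table 29 contingency table.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([
    [200, 275, 200, 200, 70],   # male responses
    [250, 325, 100, 200, 30],   # female responses
])
chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p, dof)             # dof = (2 - 1) * (5 - 1) = 4
```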

In this example, 1,050 of the 1,850 respondents (57%) have responded that they would use the bus service more frequently if the fare were decreased by $0.xx. Five hundred respondents (27%) have indicated that they would not use the bus service more frequently, and 300 respondents (16%) have indicated that they are not sure if they would change their bus use frequency. These percentages show the stated preferences of the users. The engineer does not yet know the revealed preferences of the users, but experience suggests that it is unlikely that 57% of the riders would actually increase the number of trips they make.

7. Applications in Other Areas of Transportation Research: Survey design and analysis techniques can be used to collect and present data in many areas of transportation research, including:
• Transportation Planning—to assess public response to a proposal to enact a local motor fuel tax to improve road maintenance in a city or county.
• Traffic Operations—to assess public response to implementing road diets (e.g., 4-lane to 3-lane conversions) on different corridors in a city.
• Highway Design—to assess public response to proposed alternative cross-section designs, such as a boulevard design versus an undivided multilane design in a corridor.

Example 20: Traffic Operations; Simulation

Area: Traffic operations

Method of Analysis: Simulation (using field data to simulate, or model, operations or outcomes)

1. Research Question/Problem Statement: A team of engineers wants to determine whether one or more unconventional intersection designs will produce lower travel times than a conventional design at typical intersections for a given number of lanes. There is no way to collect field data to compare alternative intersection designs at a particular site. Macroscopic traffic operations models like those in the Highway Capacity Manual do a good job of estimating delay at specific points but are unable to provide travel time estimates for unconventional designs that consist of several smaller intersections and road segments. Microscopic simulation models measure the behaviors of individual vehicles as they traverse the highway network. Such simulation models are therefore very flexible in the types of networks and measures that can be examined. The team in this example turns to a simulation model to determine how other intersection designs might work.

Question/Issue: Developing and using a computer simulation model to examine operations in a computer environment. In this example, a traffic operations simulation model is used to show whether one or more unconventional intersection designs will produce lower travel times than a conventional design at typical intersections for a given number of lanes.

2. Identification and Description of Variables: The engineering team simulates seven different intersections to provide the needed scope for their findings. At each intersection, the team examines three different sets of traffic volumes: volumes from the evening (p.m.) peak hour, a typical midday off-peak hour, and a volume that is 15% greater than the p.m. peak hour to represent future conditions. At each intersection, the team models the current conventional intersection geometry and seven unconventional designs: the quadrant roadway, median U-turn, superstreet, bowtie, jughandle, split intersection, and continuous flow intersection.
Traffic simulation models break the roadway network into nodes (intersections) and links (segments between intersections). Therefore, the engineering team has to design each of the

alternatives at each test site in terms of numbers of lanes, lane lengths, and such, and then faithfully translate that geometry into links and nodes that the simulation model can use. For each combination of traffic volume and intersection design, the team uses software to find the optimum signal timing and uses that during the simulation. To avoid bias, the team keeps all other factors (e.g., network size, numbers of lanes, turn lane lengths, truck percentages, average vehicle speeds) constant in all simulation runs.

3. Data Collection: The field data collection necessary in this effort consists of noting the current intersection geometries at the seven test intersections and counting the turning movements in the time periods described above. In many simulation efforts, it is also necessary to collect field data to calibrate and validate the simulation model. Calibration is the process by which simulation output is compared to actual measurements for some key measure(s) such as travel time. If a difference is found between the simulation output and the actual measurement, the simulation inputs are changed until the difference disappears. Validation is a test of the calibrated simulation model, comparing simulation output to a previously unused sample of actual field measurements. In this example, however, the team determines that it is unnecessary to collect calibration and validation data because a recent project has successfully calibrated and validated very similar models of most of these same unconventional designs.

The engineering team uses the CORSIM traffic operations simulation model. Well known and widely used, CORSIM models the movement of each vehicle through a specified network in small time increments. CORSIM is a good choice for this example because it was originally designed for problems of this type, has produced appropriate results, has excellent animation and other debugging features, runs quickly in these kinds of cases, and is well-supported by the software developers.

The team makes two CORSIM runs with different random number seeds for each combination of volume and design at each intersection, or 48 runs for each intersection altogether. It is necessary to make more than one run (or replication) of each simulation combination with different random number seeds because of the randomness built into simulation models. The experiment design in this case allows the team to reduce the number of replications to two; typical practice in simulations when one is making simple comparisons between two variables is to make at least 5 to 10 replications. Each run lasts 30 simulated minutes.

Table 30 shows the simulation data for one of the seven intersections. The lowest travel time produced in each case is marked with an asterisk. Notice that Table 30 does not show data for the bowtie design. That design became congested (gridlocked) and produced essentially infinite travel times for this intersection. Handling overly congested networks is a difficult problem in many efforts and with several different simulation software packages. The best current advice is for analysts to not push their networks too hard and to scan often for gridlock.

4. Specification of Analysis Technique and Data Analysis: The experiment assembled in this example uses a factorial design. (Factorial design also is discussed in Example 11.) The team analyzes the data from this factorial experiment using analysis of variance (ANOVA).
Table 30. Simulation results for different designs and time of day (total travel time in vehicle-hours; average of two simulation runs).
Time of Day   Conventional   Quadrant   Median U   Superstreet   Jughandle   Split   Continuous
Midday        67             64         61         74            63          59*     75
P.M. peak     121            95*        119        179           139         114     106
Peak + 15%    170            135*       145        245           164         180     142
*Lowest total travel time for the time period.
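A sketch of the ANOVA on these data follows. Because Table 30 reports only the cell averages, the sketch fits the two main effects alone; the team's actual analysis used both replications per cell, which is what permits testing the volume-by-design interaction discussed in Section 5:

```python
# Two-way ANOVA (main effects only) on the Table 30 cell averages.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

designs = ["Conventional", "Quadrant", "MedianU", "Superstreet",
           "Jughandle", "Split", "Continuous"]
times = {
    "Midday": [67, 64, 61, 74, 63, 59, 75],
    "PMpeak": [121, 95, 119, 179, 139, 114, 106],
    "Peak15": [170, 135, 145, 245, 164, 180, 142],
}
df = pd.DataFrame(
    [(v, d, t) for v, row in times.items() for d, t in zip(designs, row)],
    columns=["volume", "design", "travel_time"],
)
fit = smf.ols("travel_time ~ C(volume) + C(design)", data=df).fit()
print(sm.stats.anova_lm(fit, typ=2))   # F-tests for the two main effects
```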

Because the experimenter has complete control in a simulation, it is common to use efficient designs like factorials and efficient analysis methods like ANOVA to squeeze all possible information out of the effort. Statistical tests comparing the individual mean values of key results by factor are common ways to follow up on ANOVA results. Although ANOVA will reveal which factors make a significant contribution to the overall variance in the dependent variable, means tests will show which levels of a significant factor differ from the other levels. In this example, the team uses Tukey's means test, which is available as part of the battery of standard tests accompanying ANOVA in statistical software. (For more information about ANOVA, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A.)

5. Interpreting the Results: For the data shown in Table 30, the ANOVA reveals that the volume and design factors are statistically significant at the 99.99% confidence level. Furthermore, the interaction between the volume and design factors also is statistically significant at the 99.99% level. The means tests on the design factors show that the quadrant roadway is significantly different from (has a lower overall travel time than) the other designs at the 95% level. The next-best designs overall are the median U-turn and the continuous flow intersection; these are not statistically different from each other at the 95% level. The third tier of designs consists of the conventional and the split, which are statistically different from all others at the 95% level but not from each other. Finally, the jughandle and the superstreet designs are statistically different from each other and from all other designs at the 95% level according to the means test.

Through the simulation, the team learns that several designs appear to be more efficient than the conventional design, especially at higher volume levels. From the results at all seven intersections, the team sees that the quadrant roadway and median U-turn designs generally lead to the lowest travel times, especially with the higher volume levels.

6. Conclusion and Discussion: Simulation is an effective tool to analyze traffic operations, as at the seven intersections of interest in this example. No other tool would allow such a robust comparison of many different designs and provide the results for travel times in a larger network rather than delays at a single spot. The simulation conducted in this example also allows the team to conduct an efficient factorial design, which maximizes the information provided from the effort. Simulation is a useful tool in research for traffic operations because it
• affords the ability to conduct randomized experiments,
• allows the examination of details that other methods cannot provide, and
• allows the analysis of large and complex networks.

In practice, simulation also is popular because of the vivid and realistic animation output provided by common software packages. The superb animations allow analysts to spot and treat flaws in the design or model and provide agencies an effective tool by which to share designs with politicians and the public. Although simulation results can sometimes be surprising, more often they confirm what the analysts already suspect based on simpler analyses.
In the example described here, the analysts suspected that the quadrant roadway and median U-turn designs would perform well because these designs had performed well in prior Highway Capacity Manual calculations. In many studies, simulations provide rich detail and vivid animation but no big surprises.

7. Applications in Other Areas of Transportation Research: Simulations are critical analysis methods in several areas of transportation research. Besides traffic operations, simulations are used in research related to:
• Maintenance—to model the lifetime performance of traffic signs.
• Traffic Safety
– to examine vehicle performance and driver behaviors or performance.
– to predict the number of collisions from a new roadway design (potentially, given the recent development of the FHWA SSAM program).

Example 21: Traffic Safety; Non-parametric Methods

Area: Traffic safety

Method of Analysis: Non-parametric methods (methods used when data do not follow assumed or conventional distributions, such as when comparing median values)

1. Research Question/Problem Statement: A city traffic engineer has been receiving many citizen complaints about the perceived lack of safety at unsignalized midblock crosswalks. Apparently, some motorists seem surprised by pedestrians in the crosswalks and do not yield to the pedestrians. The engineer believes that larger and brighter warning signs may be an inexpensive way to enhance safety at these locations.

Question/Issue: Determine whether some treatment has an effect when the data to be tested do not follow known distributions. In this example, a non-parametric method is used to determine whether larger and brighter warning signs improve pedestrian safety at unsignalized midblock crosswalks. The null hypothesis and alternative hypothesis are stated as follows:

Ho: There is no difference in the median values of the number of conflicts before and after the treatment.
Ha: There is a difference in the median values.

2. Identification and Description of Variables: The engineer would like to collect collision data at crosswalks with improved signs, but it would take a long time at a large sample of crosswalks to collect a reasonable sample size of collisions to answer the question. Instead, the engineer collects data on conflicts, which are near-collisions in which one or both of the involved entities brakes or swerves within 2 seconds of a collision to avoid the collision. The research literature has shown that conflicts are related to collisions, and because conflicts are much more numerous than collisions, it is much quicker to collect a good sample size. Conflict data are not nearly as widely used as collision data, however, and the underlying distribution of conflict data is not clear. Thus, the use of non-parametric methods seems appropriate.

3. Data Collection: The engineer identifies seven test crosswalks in the city based on large pedestrian volumes and the presence of convenient vantage points for observing conflicts. The engineering staff collects data on traffic conflicts for 2 full days at each of the seven crosswalks with standard warning signs. The engineer then has larger and brighter warning signs installed at the seven sites. After waiting at least 1 month at each site after sign installation, the staff again collects traffic conflicts for 2 full days, making sure that weather, light, and as many other conditions as possible are similar between the before-and-after data collection periods at each site.

4. Specification of Analysis Technique and Data Analysis: A non-parametric statistical test is an efficient way to analyze data when the underlying distribution is unclear (as in this example using conflict data) and when the sample size is small (as in this example with its small number of sites). Several such tests, including the sign test and the Wilcoxon signed-rank test, are plausible in this example.
(For more information about non-parametric tests, see NCHRP Project 20-45, Volume 2, Chapter 6, Section D, "Hypothesis About Population Medians for Independent Samples.") The decision is made to use the Wilcoxon signed-rank test because it is a more powerful test for paired numerical measurements than other tests, and this example uses paired (before-and-after) measurements. The sign test is a popular non-parametric test for paired data but loses information contained in numerical measurements by reducing the data to a series of positive or negative signs.

Having decided on the Wilcoxon signed-rank test, the engineer arranges the data (see Table 31). The third row of the table is the difference between the frequencies of the two conflict measurements at each site. The last row shows the rank of each site, from lowest to highest, based on the absolute value of the difference. Site 3 has the least difference (35 − 33 = 2) while Site 7 has the greatest difference (45 − 61 = −16).

Table 31. Number of conflicts recorded during each (equal) time period at each site.
                              Site 1   Site 2   Site 3   Site 4   Site 5   Site 6   Site 7
Standard signs                170      39       35       32       32       19       45
Larger and brighter signs     155      32       33       29       25       31       61
Difference                    15       7        2        3        7        -12      -16
Rank of absolute difference   6        3.5      1        2        3.5      5        7

The Wilcoxon signed-rank test ranks the differences from low to high in terms of absolute values. In this case, that would be 2, 3, 7, 7, 12, 15, and 16. The test statistic, x, is the sum of the ranks that have positive differences. In this example, x = 1 + 2 + 3.5 + 3.5 + 6 = 16. Notice that all but the sixth- and seventh-ranked sites had positive differences. Notice also that the tied differences were assigned ranks equal to the average of the ranks they would have received if they were just slightly different from each other.

The engineer then consults a table for the Wilcoxon signed-rank test to get a critical value against which to compare. (Such a table appears in NCHRP Project 20-45, Volume 2, Appendix C, Table C-8.) The standard table for a sample size of seven shows that the critical value for a one-tailed test (testing whether there is an improvement) with a confidence level of 95% is x = 24.

5. Interpreting the Results: Because the calculated value (x = 16) is less than the critical value (x = 24), the engineer concludes that there is not a statistically significant difference between the number of conflicts recorded with standard signs and the number of conflicts recorded with larger and brighter signs.

6. Conclusion and Discussion: Non-parametric tests do not require the engineer to make restrictive assumptions about an underlying distribution and are therefore good choices in cases like this, in which the sample size is small and the data collected do not have a familiar underlying distribution. Many non-parametric tests are available, so analysts should do some reading and searching before settling on the best one for any particular case. Once a non-parametric test is determined, it is usually easy to apply.

This example also illustrates one of the potential pitfalls of statistical testing. The engineer's conclusion is that there is not a statistically significant difference between the number of conflicts recorded with standard signs and the number of conflicts recorded with larger and brighter signs. That conclusion does not necessarily mean that larger and brighter signs are a bad idea at sites similar to those tested. Notice that in this experiment, larger and brighter signs produced lower conflict frequencies at five of the seven sites, and the average number of conflicts per site was lower with the larger and brighter signs. Given that signs are relatively inexpensive, they may be a good idea at sites like those tested. A statistical test can provide useful information, especially about the quality of the experiment, but analysts must be careful not to interpret the results of a statistical test too strictly.
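The hand computation above can be checked in a few lines. The sketch reproduces the averaged ranks and the test statistic x = 16 from the Table 31 counts; note that scipy's wilcoxon falls back to a normal approximation for the p-value when tied differences are present:

```python
# Verifying the Wilcoxon signed-rank statistic from the Table 31 data.
import numpy as np
from scipy.stats import rankdata, wilcoxon

standard = np.array([170, 39, 35, 32, 32, 19, 45])  # conflicts, standard signs
brighter = np.array([155, 32, 33, 29, 25, 31, 61])  # conflicts, larger/brighter signs

diff = standard - brighter          # [15, 7, 2, 3, 7, -12, -16]
ranks = rankdata(np.abs(diff))      # the tied |7|s each get the average rank 3.5
x = ranks[diff > 0].sum()
print(x)                            # 16.0, matching the hand computation

# scipy's one-sided test of whether conflicts decreased after the treatment
print(wilcoxon(standard, brighter, alternative="greater"))
```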
In this example, the greatest danger to the validity of the test result lies not in the statistical test but in the underlying before-and-after test setup. For the results to be valid, it is necessary that the only important change that affects conflicts at the test sites during data collection be

the new signs. The engineer has kept the duration short between the before-and-after data collection periods, which helps minimize the chances of other important changes. However, if there is any reason to suspect other important changes, these test results should be viewed skeptically and a more sophisticated test strategy should be employed.

7. Applications in Other Areas of Transportation Research: Non-parametric tests are helpful when researchers are working with small sample sizes or sample data wherein the underlying distribution is unknown. Examples of other areas of transportation research in which non-parametric tests may be applied include:
• Transportation Planning, Public Transportation—to analyze data from surveys and questionnaires when the scale of the response calls into question the underlying distribution. Such data are often analyzed in transportation planning and public transportation.
• Traffic Operations—to analyze small samples of speed or volume data.
• Structures, Pavements—to analyze quality ratings of pavements, bridges, and other transportation assets. Such ratings also use scales.

Resources

The examples used in this report have included references to the following resources. Researchers are encouraged to consult these resources for more information about statistical procedures.

Freund, R. J., and W. J. Wilson (2003). Statistical Methods, 2d ed. Burlington, MA: Academic Press. See page 256 for a discussion of Tukey's procedure.

Kutner, M., et al. (2005). Applied Linear Statistical Models, 5th ed. Boston: McGraw-Hill. See page 746 for a discussion of Tukey's procedure.

NCHRP CD-22: Scientific Approaches to Transportation Research, Vol. 1 and 2 (2002). Transportation Research Board of the National Academies, Washington, D.C. This two-volume electronic manual developed under NCHRP Project 20-45 provides a comprehensive source of information on the conduct of research. The manual includes state-of-the-art techniques for problem statement development; literature searching; development of the research work plan; execution of the experiment; data collection, management, quality control, and reporting of results; and evaluation of the effectiveness of the research, as well as the requirements for the systematic, professional, and ethical conduct of transportation research. More information about NCHRP CD-22 is available at http://www.trb.org/Main/Blurbs/152122.aspx.

For readers' convenience, the references to NCHRP Project 20-45 from the various examples contained in this report are summarized here by topic and location in NCHRP CD-22.

• Analysis of Variance (one-way ANOVA and two-way ANOVA): See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 113, 119–31).
• Assumptions for residual errors: See Volume 2, Chapter 4.
• Box plots; Q-Q plots: See Volume 2, Chapter 6, Section C.
• Chi-square test: See Volume 2, Chapter 6, Sections E (Chi-Square Test for Independence) and F.
• Chi-square values: See Volume 2, Appendix C, Table C-2.
• Computations on unbalanced designs and multi-factorial designs: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31).
• Confidence intervals: See Volume 2, Chapter 4.
• Correlation coefficient: See Volume 2, Appendix A, Glossary, Correlation Coefficient.
• Critical F-value: See Volume 2, Appendix C, Table C-5.
• Desirable and undesirable residual plots (scatter plots): See Volume 2, Chapter 4, Section B, Figure 6.

  • Equation fit: See Volume 2, Chapter 4, Glossary, Descriptive Measures of Association Between X and Y.
  • Error distributions (normality, constant variance, uncorrelated, etc.): See Volume 2, Chapter 4 (pp. 146–55).
  • Experiment design and data collection: See Volume 2, Chapter 1.
  • Fcrit and F-distribution table: See Volume 2, Appendix C, Table C-5.
  • F-test (F-ratio test): See Volume 2, Chapter 4, Section A, Compute the F-ratio Test Statistic (p. 124).
  • Formulation of formal hypotheses for testing: See Volume 1, Chapter 2, Hypothesis; Volume 2, Appendix A, Glossary.
  • History and maturation biases (specification errors): See Volume 2, Chapter 1, Quasi-Experiments.
  • Indicator (dummy) variables: See Volume 2, Chapter 4 (pp. 142–45).
  • Intercept and slope: See Volume 2, Chapter 4 (pp. 140–42).
  • Maximum likelihood methods: See Volume 2, Chapter 5 (pp. 208–11).
  • Mean and standard deviation formulas: See Volume 2, Chapter 6, Table C, Frequency Distributions, Variance, Standard Deviation, Histograms, and Boxplots.
  • Measured ratio or interval scale: See Volume 2, Chapter 1 (p. 83).
  • Multinomial distribution and polychotomous logistical model: See Volume 2, Chapter 5 (pp. 211–18).
  • Multiple (multivariate) regression: See Volume 2, Chapter 4, Section B.
  • Non-parametric tests: See Volume 2, Chapter 6, Section D.
  • Normal distribution: See Volume 2, Appendix A, Glossary, Normal Distribution.
  • One- and two-sided hypothesis testing (one- and two-tail test values): See Volume 2, Chapter 4 (pp. 161 and 164–65).
  • Ordinary least squares (OLS) regression: See Volume 2, Chapter 4, Section B, Linear Regression.
  • Sample size and confidence: See Volume 2, Chapter 1, Sample Size Determination.
  • Sample size determination based on statistical power requirements: See Volume 2, Chapter 1, Sample Size Determination (p. 94).
  • Sign test and the Wilcoxon signed-rank and rank-sum tests: See Volume 2, Chapter 6, Section D, and Appendix C, Table C-8, Hypothesis About Population Medians for Independent Samples.
  • Split samples: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31).
  • Standard chi-square distribution table: See Volume 2, Appendix C, Table C-2.
  • Standard normal values: See Volume 2, Appendix C, Table C-1.
  • tcrit values: See Volume 2, Appendix C, Table C-4.
  • t-statistic: See Volume 2, Appendix A, Glossary.
  • t-statistic using equation for equal variance: See Volume 2, Appendix C, Table C-4.
  • t-test: See Volume 2, Chapter 4, Section B, How are t-statistics Interpreted?
  • Tabularized values of t-statistic: See Volume 2, Appendix C, Table C-4.
  • Tukey's test, Bonferroni's test, Scheffé's test: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31).
  • Types of data and implications for selection of analysis techniques: See Volume 2, Chapter 1, Identification of Empirical Setting.


TRB’s National Cooperative Highway Research Program (NCHRP) Report 727: Effective Experiment Design and Data Analysis in Transportation Research describes the factors that may be considered in designing experiments and presents 21 typical transportation examples illustrating the experiment design process, including selection of appropriate statistical tests.

The report is a companion to NCHRP CD-22, Scientific Approaches to Transportation Research, Volumes 1 and 2, which present detailed information on statistical methods.


8 Types of Data Analysis

The different types of data analysis include descriptive, diagnostic, exploratory, inferential, predictive, causal, mechanistic and prescriptive. Here’s what you need to know about each one.

Benedict Neo

Data analysis is an aspect of data science and  data analytics that is all about analyzing data for different kinds of purposes. The data analysis process involves inspecting, cleaning, transforming and  modeling data to draw useful insights from it.

Types of Data Analysis

  • Descriptive analysis
  • Diagnostic analysis
  • Exploratory analysis
  • Inferential analysis
  • Predictive analysis
  • Causal analysis
  • Mechanistic analysis
  • Prescriptive analysis

With its multiple facets, methodologies and techniques, data analysis is used in a variety of fields, including energy, healthcare and marketing, among others. As businesses thrive under the influence of technological advancements in data analytics, data analysis plays a huge role in decision-making , providing a better, faster and more effective system that minimizes risks and reduces human biases .

That said, there are different kinds of data analysis with different goals. We’ll examine each one below.

Two Camps of Data Analysis

Data analysis can be divided into two camps, according to the book R for Data Science :

  • Hypothesis Generation: This involves looking deeply at the data and combining your domain knowledge to generate  hypotheses about why the data behaves the way it does.
  • Hypothesis Confirmation: This involves using a precise mathematical model to generate falsifiable predictions with statistical sophistication to confirm your prior hypotheses.


Data analysis can be separated and organized into types, arranged in an increasing order of complexity.  

1. Descriptive Analysis

The goal of descriptive analysis is to describe or summarize a set of data . Here’s what you need to know:

  • Descriptive analysis is the very first analysis performed in the data analysis process.
  • It generates simple summaries of samples and measurements.
  • It involves common, descriptive statistics like measures of central tendency, variability, frequency and position.

Descriptive Analysis Example

Take the Covid-19 statistics page on Google, for example. The line graph is a pure summary of the cases/deaths, a presentation and description of the population of a particular country infected by the virus.

Descriptive analysis is the first step in analysis where you summarize and describe the data you have using descriptive statistics, and the result is a simple presentation of your data.
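
To make this concrete, here is a minimal Python sketch of descriptive analysis on a hypothetical week of daily case counts, using only the standard library:

```python
# A minimal descriptive-analysis sketch on hypothetical daily case counts.
import statistics

daily_cases = [120, 135, 160, 155, 180, 210, 198]

print("mean:", statistics.mean(daily_cases))            # central tendency
print("median:", statistics.median(daily_cases))        # robust central tendency
print("stdev:", statistics.stdev(daily_cases))          # variability
print("min/max:", min(daily_cases), max(daily_cases))   # position
```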

2. Diagnostic Analysis  

Diagnostic analysis seeks to answer the question “Why did this happen?” by taking a more in-depth look at data to uncover subtle patterns. Here’s what you need to know:

  • Diagnostic analysis typically comes after descriptive analysis, taking initial findings and investigating why certain patterns in data happen. 
  • Diagnostic analysis may involve analyzing other related data sources, including past data, to reveal more insights into current data trends.  
  • Diagnostic analysis is ideal for further exploring patterns in data to explain anomalies .  

Diagnostic Analysis Example

A footwear store wants to review its  website traffic levels over the previous 12 months. Upon compiling and assessing the data, the company’s marketing team finds that June experienced above-average levels of traffic while July and August witnessed slightly lower levels of traffic. 

To find out why this difference occurred, the marketing team takes a deeper look. Team members break down the data to focus on specific categories of footwear. For June, they discover that pages featuring sandals and other beach-related footwear received a high number of views, while those numbers dropped in July and August. 

Marketers may also review other factors like seasonal changes and company sales events to see if other variables could have contributed to this trend.    
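
A minimal pandas sketch of this kind of drill-down might look like the following; the categories and figures are hypothetical:

```python
# A minimal diagnostic drill-down with pandas (hypothetical data).
import pandas as pd

traffic = pd.DataFrame({
    "month": ["June", "June", "July", "July", "August", "August"],
    "category": ["sandals", "boots", "sandals", "boots", "sandals", "boots"],
    "page_views": [52000, 8000, 21000, 9000, 18000, 11000],
})

# Break total traffic down by month and footwear category to see
# which categories drive the June spike.
print(traffic.groupby(["month", "category"])["page_views"].sum())
```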

3. Exploratory Analysis (EDA)

Exploratory analysis involves examining or  exploring data and finding relationships between variables that were previously unknown. Here’s what you need to know:

  • EDA helps you discover relationships between measures in your data; these relationships are not evidence of causation, as the phrase “correlation doesn’t imply causation” reminds us.
  • It’s useful for discovering new connections and forming hypotheses. It drives design planning and data collection .

Exploratory Analysis Example

Climate change is an increasingly important topic as the global temperature has gradually risen over the years. One example of an exploratory data analysis on climate change involves taking the rise in temperature from 1950 to 2020 alongside measures of human activity and industrialization to find relationships in the data. For example, you may examine how the growth in the number of factories, cars on the road and airplane flights correlates with the rise in temperature.

Exploratory analysis explores data to find relationships between measures without identifying the cause. It’s most useful when formulating hypotheses. 
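
Here is a minimal pandas sketch of that kind of exploration; the figures are illustrative, not actual climate measurements:

```python
# A minimal EDA sketch: pairwise correlations on illustrative climate data.
import pandas as pd

data = pd.DataFrame({
    "year": [1950, 1970, 1990, 2010, 2020],
    "temp_anomaly_c": [-0.17, 0.03, 0.45, 0.72, 1.02],
    "co2_ppm": [311, 326, 354, 390, 414],
    "vehicles_millions": [70, 250, 560, 1015, 1400],
})

# The correlation matrix flags relationships worth investigating further,
# but correlation alone does not establish causation.
print(data.corr(numeric_only=True))
```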

4. Inferential Analysis

Inferential analysis involves using a small sample of data to infer information about a larger population of data.

The goal of statistical modeling itself is all about using a small amount of information to extrapolate and generalize information to a larger group. Here’s what you need to know:

  • Inferential analysis involves using estimated data that is representative of a population and gives a measure of uncertainty or  standard deviation to your estimation.
  • The accuracy of inference depends heavily on your sampling scheme. If the sample isn’t representative of the population, the generalization will be inaccurate. (The central limit theorem, which describes how the means of large random samples cluster around the population mean, is what makes generalizing from a representative sample possible.)

Inferential Analysis Example

A psychological study on the benefits of sleep might have a total of 500 people involved. When the researchers followed up with the participants, those who slept seven to nine hours reported better overall attention spans and well-being, while those who slept less or more than that range reported reduced attention spans and energy. This sample of 500 people is only a tiny portion of the world’s population, so the study’s conclusions are an inference about the larger population.

Inferential analysis extrapolates and generalizes the information of the larger group with a smaller sample to generate analysis and predictions. 
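
The following is a minimal Python sketch of inference on a simulated, hypothetical sample, using SciPy to put a confidence interval around the sample mean:

```python
# A minimal inferential sketch: a 95% confidence interval for mean sleep,
# estimated from a simulated, hypothetical sample of 500 participants.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=7.5, scale=1.2, size=500)  # hypothetical hours of sleep

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)

print(f"Sample mean: {mean:.2f} h, 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```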

5. Predictive Analysis

Predictive analysis involves using historical or current data to find patterns and make predictions about the future. Here’s what you need to know:

  • The accuracy of the predictions depends on the input variables.
  • Accuracy also depends on the types of models. A linear model might work well in some cases, and in other cases it might not.
  • Using a variable to predict another one doesn’t denote a causal relationship.

Predictive Analysis Example

The 2020 United States election is a popular topic and many prediction models are built to predict the winning candidate. FiveThirtyEight did this to forecast the 2016 and 2020 elections. Prediction analysis for an election would require input variables such as historical polling data, trends and current polling data in order to return a good prediction. Something as large as an election wouldn’t just be using a linear model, but a complex model with certain tunings to best serve its purpose.
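
Real election models are far more complex, but the core mechanic of fitting historical inputs to predict an outcome can be sketched in a few lines with scikit-learn; all of the numbers below are hypothetical:

```python
# A minimal predictive sketch: fit past data, then predict a future value.
# The polling numbers here are hypothetical.
from sklearn.linear_model import LinearRegression

# Input variables: [polling average, days until election]
X = [[48.2, 90], [49.1, 60], [50.3, 30], [51.0, 14]]
y = [47.9, 48.8, 50.1, 50.7]  # observed vote share in past contests

model = LinearRegression().fit(X, y)
print(model.predict([[52.0, 7]]))  # predicted vote share for a new input
```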

6. Causal Analysis

Causal analysis looks at the cause and effect of relationships between variables and is focused on finding the cause of a correlation. This way, researchers can examine how a change in one variable affects another. Here’s what you need to know:

  • To find the cause, you have to question whether the observed correlations driving your conclusion are valid. Just looking at the surface data won’t help you discover the hidden mechanisms underlying the correlations.
  • Causal analysis is applied in randomized studies focused on identifying causation.
  • Causal analysis is the gold standard in data analysis and scientific studies where the cause of a phenomenon is to be extracted and singled out, like separating wheat from chaff.
  • Good data is hard to find and requires expensive research and studies. These studies are analyzed in aggregate (multiple groups), and the observed relationships are just average effects (mean) of the whole population. This means the results might not apply to everyone.

Causal Analysis Example  

Say you want to test out whether a new drug improves human strength and focus. To do that, you perform randomized control trials for the drug to test its effect. You compare the sample of candidates for your new drug against the candidates receiving a mock control drug through a few tests focused on strength and overall focus and attention. This will allow you to observe how the drug affects the outcome. 
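
A minimal sketch of the comparison at the heart of such a randomized trial is a two-sample t-test between the treatment and control groups; the scores below are hypothetical:

```python
# A minimal causal-analysis sketch: compare treatment vs. control groups
# from a randomized trial with a two-sample t-test (hypothetical scores).
from scipy import stats

treatment_scores = [78, 82, 88, 75, 90, 85, 80, 84]
control_scores = [70, 74, 69, 72, 75, 71, 68, 73]

t_stat, p_value = stats.ttest_ind(treatment_scores, control_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Because assignment was randomized, a significant difference supports
# a causal effect of the drug rather than a mere correlation.
```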

7. Mechanistic Analysis

Mechanistic analysis is used to understand exact changes in variables that lead to other changes in other variables . In some ways, it is a predictive analysis, but it’s modified to tackle studies that require high precision and meticulous methodologies for physical or engineering science. Here’s what you need to know:

  • It’s applied in physical or engineering sciences, situations that require high  precision and little room for error, only noise in data is measurement error.
  • It’s designed to understand a biological or behavioral process, the pathophysiology of a disease or the mechanism of action of an intervention. 

Mechanistic Analysis Example

Say an experiment is done to simulate safe and effective nuclear fusion to power the world. A mechanistic analysis of the study would entail a precise balance of controlling and manipulating variables with highly accurate measures of both variables and the desired outcomes. It’s this intricate and meticulous modus operandi toward these big topics that allows for scientific breakthroughs and advancement of society.

8. Prescriptive Analysis  

Prescriptive analysis compiles insights from other previous data analyses and determines actions that teams or companies can take to prepare for predicted trends. Here’s what you need to know: 

  • Prescriptive analysis may come right after predictive analysis, but it may involve combining many different data analyses. 
  • Companies need advanced technology and plenty of resources to conduct prescriptive analysis. Artificial intelligence systems that process data and adjust automated tasks are an example of the technology required to perform prescriptive analysis.  

Prescriptive Analysis Example

Prescriptive analysis is pervasive in everyday life, driving the curated content users consume on social media. On platforms like TikTok and Instagram,  algorithms can apply prescriptive analysis to review past content a user has engaged with and the kinds of behaviors they exhibited with specific posts. Based on these factors, an  algorithm seeks out similar content that is likely to elicit the same response and  recommends it on a user’s personal feed. 


When to Use the Different Types of Data Analysis  

  • Descriptive analysis summarizes the data at hand and presents your data in a comprehensible way.
  • Diagnostic analysis takes a more detailed look at data to reveal why certain patterns occur, making it a good method for explaining anomalies. 
  • Exploratory data analysis helps you discover correlations and relationships between variables in your data.
  • Inferential analysis is for generalizing the larger population with a smaller sample size of data.
  • Predictive analysis helps you make predictions about the future with data.
  • Causal analysis emphasizes finding the cause of a correlation between variables.
  • Mechanistic analysis is for measuring the exact changes in variables that lead to other changes in other variables.
  • Prescriptive analysis combines insights from different data analyses to develop a course of action teams and companies can take to capitalize on predicted outcomes. 

A few important tips to remember about data analysis include:

  • Correlation doesn’t imply causation.
  • EDA helps discover new connections and form hypotheses.
  • Accuracy of inference depends on the sampling scheme.
  • A good prediction depends on the right input variables.
  • A simple linear model with enough data usually does the trick.
  • Using a variable to predict another doesn’t denote causal relationships.
  • Good data is hard to find, and to produce it requires expensive research.
  • Results from studies are done in aggregate and are average effects and might not apply to everyone.​

Frequently Asked Questions

What is an example of data analysis?

A marketing team reviews a company’s web traffic over the past 12 months. To understand why sales rise and fall during certain months, the team breaks down the data to look at shoe type, seasonal patterns and sales events. Based on this in-depth analysis, the team can determine variables that influenced web traffic and make adjustments as needed.

How do you know which data analysis method to use?

Selecting a data analysis method depends on the goals of the analysis and the complexity of the task, among other factors. It’s best to assess the circumstances and consider the pros and cons of each type of data analysis before moving forward with a particular method.



Data Analysis: Types, Methods & Techniques (a Complete List)


While the term sounds intimidating, “data analysis” is nothing more than making sense of information in a table. It consists of filtering, sorting, grouping, and manipulating data tables with basic algebra and statistics.

In fact, you don’t need experience to understand the basics. You have already worked with data extensively in your life, and “analysis” is nothing more than a fancy word for good sense and basic logic.

Over time, people have intuitively categorized the best logical practices for treating data. These categories are what we call today types , methods , and techniques .

This article provides a comprehensive list of types, methods, and techniques, and explains the difference between them.

For a practical intro to data analysis (including types, methods, & techniques), check out our Intro to Data Analysis eBook for free.

Descriptive, Diagnostic, Predictive, & Prescriptive Analysis

If you Google “types of data analysis,” the first few results will explore descriptive , diagnostic , predictive , and prescriptive analysis. Why? Because these names are easy to understand and are used a lot in “the real world.”

Descriptive analysis is an informational method, diagnostic analysis explains “why” a phenomenon occurs, predictive analysis seeks to forecast the result of an action, and prescriptive analysis identifies solutions to a specific problem.

That said, these are only four branches of a larger analytical tree.

Good data analysts know how to position these four types within other analytical methods and tactics, allowing them to leverage the strengths and weaknesses of each to unearth the most valuable insights.

Let’s explore the full analytical tree to understand how to appropriately assess and apply these four traditional types.

Tree diagram of Data Analysis Types, Methods, and Techniques

Here’s a picture to visualize the structure and hierarchy of data analysis types, methods, and techniques.

If it’s too small you can view the picture in a new tab . Open it to follow along!

[Tree diagram: data analysis types, methods, and techniques]

Note: basic descriptive statistics such as mean, median, and mode, as well as standard deviation, are not shown because most people are already familiar with them. In the diagram, they would fall under the “descriptive” analysis type.

Tree Diagram Explained

The highest-level classification of data analysis is quantitative vs qualitative. Quantitative implies numbers while qualitative implies information other than numbers.

Quantitative data analysis then splits into mathematical analysis and artificial intelligence (AI) analysis. Mathematical types then branch into descriptive, diagnostic, predictive, and prescriptive.

Methods falling under mathematical analysis include clustering, classification, forecasting, and optimization. Qualitative data analysis methods include content analysis, narrative analysis, discourse analysis, framework analysis, and/or grounded theory.

Moreover, mathematical techniques include regression, Naïve Bayes, simple exponential smoothing, cohorts, factors, linear discriminants, and more, whereas techniques falling under the AI type include artificial neural networks, decision trees, evolutionary programming, and fuzzy logic. Techniques under qualitative analysis include text analysis, coding, idea pattern analysis, and word frequency.

It’s a lot to remember! Don’t worry, once you understand the relationship and motive behind all these terms, it’ll be like riding a bike.

We’ll move down the list from top to bottom and I encourage you to open the tree diagram above in a new tab so you can follow along .

But first, let’s just address the elephant in the room: what’s the difference between methods and techniques anyway?

Difference between methods and techniques

Though often used interchangeably, methods and techniques are not the same. By definition, methods are the process by which techniques are applied, and techniques are the practical application of those methods.

For example, consider driving. Methods include staying in your lane, stopping at a red light, and parking in a spot. Techniques include turning the steering wheel, braking, and pushing the gas pedal.

Data sets: observations and fields

It’s important to understand the basic structure of data tables to comprehend the rest of the article. A data set consists of one far-left column containing observations, then a series of columns containing the fields (aka “traits” or “characteristics”) that describe each observations. For example, imagine we want a data table for fruit. It might look like this:

| The fruit (observation) | Avg. weight (field 1) | Avg. diameter (field 2) | Avg. time to eat (field 3) |
| --- | --- | --- | --- |
| Watermelon | 20 lbs (9 kg) | 16 inch (40 cm) | 20 minutes |
| Apple | .33 lbs (.15 kg) | 4 inch (8 cm) | 5 minutes |
| Orange | .30 lbs (.14 kg) | 4 inch (8 cm) | 5 minutes |
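
In Python, this structure maps directly onto a pandas DataFrame, with one row per observation and one column per field:

```python
# The fruit table as a pandas DataFrame: rows are observations, columns are fields.
import pandas as pd

fruit = pd.DataFrame({
    "fruit": ["Watermelon", "Apple", "Orange"],   # observation
    "avg_weight_lbs": [20, 0.33, 0.30],           # field 1
    "avg_diameter_in": [16, 4, 4],                # field 2
    "avg_minutes_to_eat": [20, 5, 5],             # field 3
})

print(fruit)
```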

Now let’s turn to types, methods, and techniques. Each heading below consists of a description, relative importance, the nature of data it explores, and the motivation for using it.

Quantitative Analysis

  • It accounts for more than 50% of all data analysis and is by far the most widespread and well-known type of data analysis.
  • As you have seen, it holds descriptive, diagnostic, predictive, and prescriptive methods, which in turn hold some of the most important techniques available today, such as clustering and forecasting.
  • It can be broken down into mathematical and AI analysis.
  • Importance: Very high. Quantitative analysis is a must for anyone interested in becoming or improving as a data analyst.
  • Nature of Data: data treated under quantitative analysis is, quite simply, quantitative. It encompasses all numeric data.
  • Motive: to extract insights. (Note: we’re at the top of the pyramid, this gets more insightful as we move down.)

Qualitative Analysis

  • It accounts for less than 30% of all data analysis and is common in social sciences .
  • It can refer to the simple recognition of qualitative elements, which is not analytic in any way, but most often refers to methods that assign numeric values to non-numeric data for analysis.
  • Because of this, some argue that it’s ultimately a quantitative type.
  • Importance: Medium. In general, knowing qualitative data analysis is not common or even necessary for corporate roles. However, for researchers working in social sciences, its importance is very high .
  • Nature of Data: data treated under qualitative analysis is non-numeric. However, as part of the analysis, analysts turn non-numeric data into numbers, at which point many argue it is no longer qualitative analysis.
  • Motive: to extract insights. (This will be more important as we move down the pyramid.)

Mathematical Analysis

  • Description: mathematical data analysis is a subtype of quantitative data analysis that designates methods and techniques based on statistics, algebra, and logical reasoning to extract insights. It stands in opposition to artificial intelligence analysis.
  • Importance: Very High. The most widespread methods and techniques fall under mathematical analysis. In fact, it’s so common that many people use “quantitative” and “mathematical” analysis interchangeably.
  • Nature of Data: numeric. By definition, all data under mathematical analysis are numbers.
  • Motive: to extract measurable insights that can be acted upon.

Artificial Intelligence & Machine Learning Analysis

  • Description: artificial intelligence and machine learning analyses designate techniques based on the titular skills. They are not traditionally mathematical, but they are quantitative since they use numbers. Applications of AI & ML analysis techniques are developing quickly, but they are not yet mainstream across the field.
  • Importance: Medium. As of today (September 2020), you don’t need to be fluent in AI & ML data analysis to be a great analyst. BUT, if it’s a field that interests you, learn it. Many believe that in 10 years’ time its importance will be very high.
  • Nature of Data: numeric.
  • Motive: to create calculations that build on themselves in order to extract insights without direct input from a human.

Descriptive Analysis

  • Description: descriptive analysis is a subtype of mathematical data analysis that uses methods and techniques to provide information about the size, dispersion, groupings, and behavior of data sets. This may sound complicated, but just think about mean, median, and mode: all three are types of descriptive analysis. They provide information about the data set. We’ll look at specific techniques below.
  • Importance: Very high. Descriptive analysis is among the most commonly used data analyses in both corporations and research today.
  • Nature of Data: the nature of data under descriptive statistics is sets. A set is simply a collection of numbers that behaves in predictable ways. Data reflects real life, and there are patterns everywhere to be found. Descriptive analysis describes those patterns.
  • Motive: the motive behind descriptive analysis is to understand how numbers in a set group together, how far apart they are from each other, and how often they occur. As with most statistical analysis, the more data points there are, the easier it is to describe the set.

Diagnostic Analysis

  • Description: diagnostic analysis answers the question “why did it happen?” It is an advanced type of mathematical data analysis that manipulates multiple techniques, but does not own any single one. Analysts engage in diagnostic analysis when they try to explain why.
  • Importance: Very high. Diagnostics are probably the most important type of data analysis for people who don’t do analysis because they’re valuable to anyone who’s curious. They’re most common in corporations, as managers often only want to know the “why.”
  • Nature of Data : data under diagnostic analysis are data sets. These sets in themselves are not enough under diagnostic analysis. Instead, the analyst must know what’s behind the numbers in order to explain “why.” That’s what makes diagnostics so challenging yet so valuable.
  • Motive: the motive behind diagnostics is to diagnose — to understand why.

Predictive Analysis

  • Description: predictive analysis uses past data to project future data. It’s very often one of the first kinds of analysis new researchers and corporate analysts use because it is intuitive. It is a subtype of the mathematical type of data analysis, and its three notable techniques are regression, moving average, and exponential smoothing.
  • Importance: Very high. Predictive analysis is critical for any data analyst working in a corporate environment. Companies always want to know what the future will hold — especially for their revenue.
  • Nature of Data: Because past and future imply time, predictive data always includes an element of time. Whether it’s minutes, hours, days, months, or years, we call this time series data . In fact, this data is so important that I’ll mention it twice so you don’t forget: predictive analysis uses time series data .
  • Motive: the motive for investigating time series data with predictive analysis is to predict the future in the most analytical way possible.

Prescriptive Analysis

  • Description: prescriptive analysis is a subtype of mathematical analysis that answers the question “what will happen if we do X?” It’s largely underestimated in the data analysis world because it requires diagnostic and descriptive analyses to be done before it even starts. More than simple predictive analysis, prescriptive analysis builds entire data models to show how a simple change could impact the ensemble.
  • Importance: High. Prescriptive analysis is most common under the finance function in many companies. Financial analysts use it to build models of the financial statements that show how the data would change given alternative inputs.
  • Nature of Data: the nature of data in prescriptive analysis is data sets. These data sets contain patterns that respond differently to various inputs. Data that is useful for prescriptive analysis contains correlations between different variables. It’s through these correlations that we establish patterns and prescribe action on this basis. This analysis cannot be performed on data that exists in a vacuum — it must be viewed on the backdrop of the tangibles behind it.
  • Motive: the motive for prescriptive analysis is to establish, with an acceptable degree of certainty, what results we can expect given a certain action. As you might expect, this necessitates that the analyst or researcher be aware of the world behind the data, not just the data itself.

Clustering Method

  • Description: the clustering method groups data points together based on their relative closeness to further explore and treat them based on these groupings. There are two ways to group clusters: intuitively and statistically (e.g., with k-means).
  • Importance: Very high. Though most corporate roles group clusters intuitively based on management criteria, a solid understanding of how to group them mathematically is an excellent descriptive and diagnostic approach to allow for prescriptive analysis thereafter.
  • Nature of Data : the nature of data useful for clustering is sets with 1 or more data fields. While most people are used to looking at only two dimensions (x and y), clustering becomes more accurate the more fields there are.
  • Motive: the motive for clustering is to understand how data sets group and to explore them further based on those groups.
  • Here’s an example set:

[Image: an example data set grouped into clusters]

Classification Method

  • Description: the classification method aims to separate and group data points based on common characteristics . This can be done intuitively or statistically.
  • Importance: High. While simple on the surface, classification can become quite complex. It’s very valuable in corporate and research environments, but can feel like it’s not worth the work. A good analyst can execute it quickly to deliver results.
  • Nature of Data: the nature of data useful for classification is data sets. As we will see, it can be used on qualitative data as well as quantitative. This method requires knowledge of the substance behind the data, not just the numbers themselves.
  • Motive: the motive for classification is to group data not by mathematical relationships (which would be clustering), but by predetermined outputs. This is why it’s less useful for diagnostic analysis, and more useful for prescriptive analysis.

Forecasting Method

  • Description: the forecasting method uses past time series data to forecast the future.
  • Importance: Very high. Forecasting falls under predictive analysis and is arguably the most common and most important method in the corporate world. It is less useful in research, which prefers to understand the known rather than speculate about the future.
  • Nature of Data: data useful for forecasting is time series data, which, as we’ve noted, always includes a variable of time.
  • Motive: the motive for the forecasting method is the same as that of predictive analysis: to confidently estimate future values.

Optimization Method

  • Description: the optimization method maximizes or minimizes values in a set given certain criteria. It is arguably most common in prescriptive analysis. In mathematical terms, it is maximizing or minimizing a function given certain constraints.
  • Importance: Very high. The idea of optimization applies to more analysis types than any other method. In fact, some argue that it is the fundamental driver behind data analysis. You would use it everywhere in research and in a corporation.
  • Nature of Data: the nature of optimizable data is a data set of at least two points.
  • Motive: the motive behind optimization is to achieve the best result possible given certain conditions.

Content Analysis Method

  • Description: content analysis is a method of qualitative analysis that quantifies textual data to track themes across a document. It’s most common in academic fields and in social sciences, where written content is the subject of inquiry.
  • Importance: High. In a corporate setting, content analysis as such is less common. If anything, Naïve Bayes (a technique we’ll look at below) is the closest corporations come to text analysis. However, it is of the utmost importance for researchers.
  • Nature of Data: data useful for content analysis is textual data.
  • Motive: the motive behind content analysis is to understand themes expressed in a large text.

Narrative Analysis Method

  • Description: narrative analysis is a method of qualitative analysis that quantifies stories to trace themes in them. It differs from content analysis because it focuses on stories rather than research documents, and the techniques used differ slightly from those in content analysis (the nuances are outside the scope of this article).
  • Importance: Low. Unless you are highly specialized in working with stories, narrative analysis is rare.
  • Nature of Data: the nature of the data useful for the narrative analysis method is narrative text.
  • Motive: the motive for narrative analysis is to uncover hidden patterns in narrative text.

Discourse Analysis Method

  • Description: the discourse analysis method falls under qualitative analysis and uses thematic coding to trace patterns in real-life discourse. That said, real-life discourse is oral, so it must first be transcribed into text.
  • Importance: Low. Unless you are focused on understanding real-world idea sharing in a research setting, this kind of analysis is less common than the others on this list.
  • Nature of Data: the nature of data useful in discourse analysis is first audio files, then transcriptions of those audio files.
  • Motive: the motive behind discourse analysis is to trace patterns of real-world discussions. (As a spooky sidenote, have you ever felt like your phone microphone was listening to you and making reading suggestions? If it was, the method was discourse analysis.)

Framework Analysis Method

  • Description: the framework analysis method falls under qualitative analysis and uses similar thematic coding techniques to content analysis. However, where content analysis aims to discover themes, framework analysis starts with a framework and only considers elements that fall in its purview.
  • Importance: Low. As with the other textual analysis methods, framework analysis is less common in corporate settings. Even in the world of research, only some use it. Strangely, it’s very common for legislative and political research.
  • Nature of Data: the nature of data useful for framework analysis is textual.
  • Motive: the motive behind framework analysis is to understand what themes and parts of a text match your search criteria.

Grounded Theory Method

  • Description: the grounded theory method falls under qualitative analysis and uses thematic coding to build theories around those themes.
  • Importance: Low. Like other qualitative analysis techniques, grounded theory is less common in the corporate world. Even among researchers, you would be hard pressed to find many using it. Though powerful, it’s simply too rare to spend time learning.
  • Nature of Data: the nature of data useful in the grounded theory method is textual.
  • Motive: the motive of grounded theory method is to establish a series of theories based on themes uncovered from a text.

Clustering Technique: K-Means

  • Description: k-means is a clustering technique in which data points are grouped into clusters that have the closest means. Though not usually labeled AI or ML, it is essentially an unsupervised learning algorithm, reevaluating clusters as data points are added. Clustering techniques can be used in diagnostic, descriptive, & prescriptive data analyses.
  • Importance: Very important. If you only take 3 things from this article, k-means clustering should be part of it. It is useful in any situation where n observations have multiple characteristics and we want to put them in groups.
  • Nature of Data: the nature of data is at least one characteristic per observation, but the more the merrier.
  • Motive: the motive for clustering techniques such as k-means is to group observations together and either understand or react to them.
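
Here is a minimal scikit-learn sketch of k-means on a small, hypothetical two-field data set:

```python
# A minimal k-means sketch with scikit-learn (hypothetical 2-field data).
from sklearn.cluster import KMeans

points = [[1, 2], [1, 4], [2, 3],       # one natural group
          [9, 10], [10, 12], [11, 11]]  # another natural group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # cluster assignment for each observation
print(kmeans.cluster_centers_)  # the mean of each cluster
```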

Regression Technique

  • Description: simple and multivariable regressions use either one independent variable or a combination of multiple independent variables to model a correlation with a single dependent variable using fitted constants. Regressions are almost synonymous with correlation today.
  • Importance: Very high. Along with clustering, if you only take 3 things from this article, regression techniques should be part of it. They’re everywhere in corporate and research fields alike.
  • Nature of Data: the nature of data used in regressions is data sets with “n” number of observations and as many variables as are reasonable. It’s important, however, to distinguish between regression data and time series data: you cannot run a regression on time series data without accounting for time. The easier way is to use techniques under the forecasting method.
  • Motive: The motive behind regression techniques is to understand correlations between independent variable(s) and a dependent one.
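
Here is a minimal multivariable regression sketch with statsmodels, which reports the fitted constants directly; the housing figures are hypothetical:

```python
# A minimal multivariable regression sketch with statsmodels (hypothetical data).
import statsmodels.api as sm

X = [[1200, 3], [1500, 3], [1700, 4], [2000, 4], [2400, 5]]  # sqft, bedrooms
y = [200_000, 240_000, 271_000, 305_000, 360_000]            # price

X = sm.add_constant(X)   # adds the intercept term
model = sm.OLS(y, X).fit()
print(model.params)      # intercept and one coefficient per variable
print(model.rsquared)    # strength of the fitted correlation
```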

Naïve Bayes Technique

  • Description: Naïve Bayes is a classification technique that uses simple probability to classify items based on previous classifications. In plain English, the formula would be “the chance that a thing with trait x belongs to class c depends on (=) the overall chance of trait x appearing in class c, multiplied by the overall chance of class c, divided by the overall chance of trait x.” As a formula, it’s P(c|x) = P(x|c) * P(c) / P(x).
  • Importance: High. Naïve Bayes is a very common, simple classification technique because it’s effective with large data sets and it can be applied to any instance in which there is a class. Google, for example, might use it to group webpages into groups for certain search engine queries.
  • Nature of Data: the nature of data for Naïve Bayes is at least one class and at least two traits in a data set.
  • Motive: the motive behind Naïve Bayes is to classify observations based on previous data. It’s thus considered part of predictive analysis.
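
To ground the formula, here is a minimal worked sketch that computes P(c|x) directly from hypothetical counts:

```python
# A minimal worked Naïve Bayes sketch: P(c|x) = P(x|c) * P(c) / P(x),
# computed from hypothetical counts over 100 observed items.
p_c = 40 / 100         # P(c): 40 of 100 items belong to class c
p_x = 25 / 100         # P(x): 25 of 100 items have trait x
p_x_given_c = 20 / 40  # P(x|c): 20 of the 40 class-c items have trait x

p_c_given_x = p_x_given_c * p_c / p_x
print(p_c_given_x)     # 0.8 -> an item with trait x is 80% likely to be class c
```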

Cohorts Technique

  • Description: cohorts technique is a type of clustering method used in behavioral sciences to separate users by common traits. As with clustering, it can be done intuitively or mathematically, the latter of which would simply be k-means.
  • Importance: Very high. While it resembles k-means, the cohort technique is more of a high-level counterpart. In fact, most people are familiar with it as a part of Google Analytics. It’s most common in marketing departments in corporations, rather than in research.
  • Nature of Data: the nature of cohort data is data sets in which users are the observations and other fields are used as defining traits for each cohort.
  • Motive: the motive for cohort analysis techniques is to group similar users and analyze how you retain them and how they churn.

Factor Technique

  • Description: the factor analysis technique is a way of grouping many traits into a single factor to expedite analysis. For example, factors can be used as traits for Naïve Bayes classifications instead of more general fields.
  • Importance: High. While not commonly employed in corporations, factor analysis is hugely valuable. Good data analysts use it to simplify their projects and communicate them more clearly.
  • Nature of Data: the nature of data useful in factor analysis techniques is data sets with a large number of fields on its observations.
  • Motive: the motive for using factor analysis techniques is to reduce the number of fields in order to more quickly analyze and communicate findings.

Linear Discriminants Technique

  • Description: linear discriminant analysis techniques are similar to regressions in that they use one or more independent variables to determine a dependent variable; however, the linear discriminant technique falls under a classifier method since it uses traits as independent variables and class as a dependent variable. In this way, it becomes a classifying method AND a predictive method.
  • Importance: High. Though the analyst world speaks of and uses linear discriminants less commonly, it’s a highly valuable technique to keep in mind as you progress in data analysis.
  • Nature of Data: the nature of data useful for the linear discriminant technique is data sets with many fields.
  • Motive: the motive for using linear discriminants is to classify observations that would be otherwise too complex for simple techniques like Naïve Bayes.

Exponential Smoothing Technique

  • Description: exponential smoothing is a technique falling under the forecasting method that uses a smoothing factor on prior data in order to predict future values. It can be linear or adjusted for seasonality. The basic principle behind exponential smoothing is to place a larger percent weight (a value between 0 and 1 called alpha) on more recent values in a series and a smaller weight on less recent values. The formula is s(t) = alpha * x(t) + (1 - alpha) * s(t-1), where x(t) is the current period’s value and s(t-1) is the previous smoothed value.
  • Importance: High. Most analysts still use the moving average technique (covered next) for forecasting because it’s easy to understand, though it is less efficient than exponential smoothing. Good analysts will have exponential smoothing techniques in their pocket to increase the value of their forecasts.
  • Nature of Data: the nature of data useful for exponential smoothing is time series data. Time series data has time as part of its fields.
  • Motive: the motive for exponential smoothing is to forecast future values with a smoothing variable.
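
The formula translates into just a few lines of Python; here is a minimal sketch over a hypothetical sales series:

```python
# A minimal simple exponential smoothing sketch:
# s(t) = alpha * x(t) + (1 - alpha) * s(t-1)
def exponential_smoothing(series, alpha=0.5):
    smoothed = [series[0]]  # seed with the first observed value
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

sales = [100, 110, 104, 120, 125]  # hypothetical series
print(exponential_smoothing(sales, alpha=0.5))
# The last smoothed value serves as the forecast for the next period.
```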

Moving Average Technique

  • Description: the moving average technique falls under the forecasting method and uses an average of recent values to predict future ones. For example, to predict rainfall in April, you would take the average of rainfall from January to March. It’s simple, yet highly effective.
  • Importance: Very high. While I’m personally not a huge fan of moving averages due to their simplistic nature and lack of consideration for seasonality, they’re the most common forecasting technique and therefore very important.
  • Nature of Data: the nature of data useful for moving averages is time series data .
  • Motive: the motive for moving averages is to predict future values in a simple, easy-to-communicate way.
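
Here is a minimal pandas sketch of the rainfall example above:

```python
# A minimal moving-average sketch with pandas (hypothetical monthly rainfall).
import pandas as pd

rainfall = pd.Series([3.1, 2.8, 3.4],
                     index=["January", "February", "March"])

# The 3-month moving average: the April forecast is the Jan-Mar mean.
april_forecast = rainfall.rolling(window=3).mean().iloc[-1]
print(april_forecast)  # 3.1
```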

Neural Networks Technique

  • Description: neural networks are a highly complex artificial intelligence technique that replicate a human’s neural analysis through a series of hyper-rapid computations and comparisons that evolve in real time. This technique is so complex that an analyst must use computer programs to perform it.
  • Importance: Medium. While the potential for neural networks is theoretically unlimited, it’s still little understood and therefore uncommon. You do not need to know it by any means in order to be a data analyst.
  • Nature of Data: the nature of data useful for neural networks is data sets of astronomical size, meaning with hundreds of thousands of fields and at least the same number of rows.
  • Motive: the motive for neural networks is to understand wildly complex phenomena and data in order to act on them.

Decision Tree Technique

  • Description: the decision tree technique uses artificial intelligence algorithms to rapidly calculate possible decision pathways and their outcomes on a real-time basis. It’s so complex that computer programs are needed to perform it.
  • Importance: Medium. As with neural networks, decision trees with AI are too little understood and are therefore uncommon in corporate and research settings alike.
  • Nature of Data: the nature of data useful for the decision tree technique is hierarchical data sets that show multiple optional fields for each preceding field.
  • Motive: the motive for decision tree techniques is to compute the optimal choices to make in order to achieve a desired result.

Evolutionary Programming Technique

  • Description: the evolutionary programming technique uses a series of neural networks, sees how well each one fits a desired outcome, and selects only the best to test and retest. It’s called evolutionary because it resembles the process of natural selection by weeding out weaker options.
  • Importance: Medium. As with the other AI techniques, evolutionary programming just isn’t well-understood enough to be usable in many cases. Its complexity also makes it hard to explain in corporate settings and difficult to defend in research settings.
  • Nature of Data: the nature of data in evolutionary programming is data sets of neural networks, or data sets of data sets.
  • Motive: the motive for using evolutionary programming is similar to decision trees: understanding the best possible option from complex data.

Fuzzy Logic Technique

  • Description: fuzzy logic is a type of computing based on “approximate truths” rather than simple truths such as “true” and “false.” It is essentially two tiers of classification. For example, to say whether “Apples are good,” you need to first classify that “Good is x, y, z.” Only then can you say apples are good. Put another way, fuzzy logic helps a computer evaluate truth the way humans do, on a scale of “definitely true, probably true, maybe true, probably false, definitely false.”
  • Importance: Medium. Like the other AI techniques, fuzzy logic is uncommon in both research and corporate settings, which means it’s less important in today’s world.
  • Nature of Data: the nature of fuzzy logic data is huge data tables that include other huge data tables with a hierarchy including multiple subfields for each preceding field.
  • Motive: the motive of fuzzy logic is to replicate human truth valuations in a computer in order to model human decisions based on past data. The obvious possible application is marketing.

Text Analysis Technique

  • Description: text analysis techniques fall under the qualitative data analysis type and use text to extract insights.
  • Importance: Medium. Text analysis techniques, like all the qualitative analysis type, are most valuable for researchers.
  • Nature of Data: the nature of data useful in text analysis is words.
  • Motive: the motive for text analysis is to trace themes in a text across sets of very long documents, such as books.

Coding Technique

  • Description: the coding technique is used in textual analysis to turn ideas into uniform phrases and analyze the number of times and the ways in which those ideas appear. For this reason, some consider it a quantitative technique as well.
  • Importance: Very high. If you’re a researcher working in social sciences, coding is THE analysis technique, and for good reason. It’s a great way to add rigor to analysis. That said, it’s less common in corporate settings.
  • Nature of Data: the nature of data useful for coding is long text documents.
  • Motive: the motive for coding is to make tracing ideas on paper more than an exercise of the mind, by quantifying them and understanding them through descriptive methods.

Idea Pattern Technique

  • Description: the idea pattern analysis technique fits into coding as the second step of the process. Once themes and ideas are coded, simple descriptive analysis tests may be run. Some people even cluster the ideas!
  • Importance: Very high. If you’re a researcher, idea pattern analysis is as important as the coding itself.
  • Nature of Data: the nature of data useful for idea pattern analysis is already coded themes.
  • Motive: the motive for the idea pattern technique is to trace ideas in otherwise unmanageably large documents.

Word Frequency Technique

  • Description: word frequency is a qualitative technique that stands in opposition to coding and uses an inductive approach to locate specific words in a document in order to understand its relevance. Word frequency is essentially the descriptive analysis of qualitative data because it uses stats like mean, median, and mode to gather insights.
  • Importance: High. As with the other qualitative approaches, word frequency is very important in social science research, but less so in corporate settings.
  • Nature of Data: the nature of data useful for word frequency is long, informative documents.
  • Motive: the motive for word frequency is to locate target words to determine the relevance of a document in question.
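
Here is a minimal word-frequency sketch using only the Python standard library:

```python
# A minimal word-frequency sketch with the standard library.
from collections import Counter
import re

document = """Transportation research relies on data. Data quality drives
the value of transportation research, and data collection is costly."""

words = re.findall(r"[a-z']+", document.lower())
frequencies = Counter(words)

# The most frequent content words hint at the document's relevance.
print(frequencies.most_common(5))
```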

Types of data analysis in research

Types of data analysis in research methodology include every item discussed in this article. As a list, they are:

  • Quantitative
  • Qualitative
  • Mathematical
  • Machine Learning and AI
  • Descriptive
  • Diagnostic
  • Predictive
  • Prescriptive
  • Clustering
  • Classification
  • Forecasting
  • Optimization
  • Content analysis
  • Narrative analysis
  • Discourse analysis
  • Framework analysis
  • Grounded theory
  • K-means clustering
  • Regression
  • Naïve Bayes
  • Cohorts
  • Factors
  • Linear discriminant
  • Exponential smoothing
  • Moving average
  • Artificial Neural Networks
  • Decision Trees
  • Evolutionary Programming
  • Fuzzy Logic
  • Text analysis
  • Coding
  • Idea Pattern Analysis
  • Word Frequency Analysis

Types of data analysis in qualitative research

As a list, the types of data analysis in qualitative research are the following methods:

  • Content analysis
  • Narrative analysis
  • Discourse analysis
  • Framework analysis
  • Grounded theory

Types of data analysis in quantitative research

As a list, the types of data analysis in quantitative research are:

  • Mathematical analysis
  • Artificial intelligence (AI) analysis
  • Descriptive analysis
  • Diagnostic analysis
  • Predictive analysis
  • Prescriptive analysis

Data analysis methods

As a list, data analysis methods are:

  • Clustering (quantitative)
  • Classification (quantitative)
  • Forecasting (quantitative)
  • Optimization (quantitative)

  • Content (qualitative)
  • Narrative (qualitative)
  • Discourse (qualitative)
  • Framework (qualitative)
  • Grounded theory (qualitative)

Quantitative data analysis methods

As a list, quantitative data analysis methods are:

  • Clustering
  • Classification
  • Forecasting
  • Optimization

Tabular View of Data Analysis Types, Methods, and Techniques

| Level | Items |
| --- | --- |
| Types (numeric or non-numeric) | Quantitative; Qualitative |
| Types, tier 2 (traditional numeric or new numeric) | Mathematical; Artificial Intelligence (AI) |
| Types, tier 3 (informative nature) | Descriptive; Diagnostic; Predictive; Prescriptive |
| Methods | Clustering; Classification; Forecasting; Optimization; Content analysis; Narrative analysis; Discourse analysis; Framework analysis; Grounded theory |
| Techniques | Clustering (doubles as a technique); Regression (linear and multivariable); Naïve Bayes; Cohorts; Factors; Linear discriminants; Exponential smoothing; Moving average; Neural networks; Decision trees; Evolutionary programming; Fuzzy logic; Text analysis; Coding; Idea pattern analysis; Word frequency |

About the Author

Noah is the founder & Editor-in-Chief at AnalystAnswers. He is a transatlantic professional and entrepreneur with 5+ years of corporate finance and data analytics experience, as well as 3+ years in consumer financial products and business software. He started AnalystAnswers to provide aspiring professionals with accessible explanations of otherwise dense finance and data concepts. Noah believes everyone can benefit from an analytical mindset in a growing digital world. When he's not busy at work, Noah likes to explore new European cities, exercise, and spend time with friends and family.



Experimental Design for ANOVA

There is a close relationship between experimental design and statistical analysis. The way that an experiment is designed determines the types of analyses that can be appropriately conducted.

In this lesson, we review aspects of experimental design that a researcher must understand in order to properly interpret experimental data with analysis of variance.

What Is an Experiment?

An experiment is a procedure carried out to investigate cause-and-effect relationships. For example, the experimenter may manipulate one or more variables (independent variables) to assess the effect on another variable (the dependent variable).

Conclusions are reached on the basis of data. If the dependent variable is unaffected by changes in independent variables, we conclude that there is no causal relationship between the dependent variable and the independent variables. On the other hand, if the dependent variable is affected, we conclude that a causal relationship exists.

Experimenter Control

One of the features that distinguish a true experiment from other types of studies is experimenter control of the independent variable(s).

In a true experiment, an experimenter controls the level of the independent variable administered to each subject. For example, dosage level could be an independent variable in a true experiment, because an experimenter can manipulate the dosage administered to any subject.

What is a Quasi-Experiment?

A quasi-experiment is a study that lacks a critical feature of a true experiment. Quasi-experiments can provide insights into cause-and-effect relationships; but evidence from a quasi-experiment is not as persuasive as evidence from a true experiment. True experiments are the gold standard for causal analysis.

A study that used gender or IQ as an independent variable would be an example of a quasi-experiment, because the study lacks experimenter control over the independent variable; that is, an experimenter cannot manipulate the gender or IQ of a subject.

As we discuss experimental design in the context of a tutorial on analysis of variance, it is important to point out that experimenter control is a requirement for a true experiment; but it is not a requirement for analysis of variance. Analysis of variance can be used with true experiments and with quasi-experiments that lack only experimenter control over the independent variable.

Note: Henceforth in this tutorial, when we refer to an experiment, we will be referring to a true experiment or to a quasi-experiment that is almost a true experiment, in the sense that it lacks only experimenter control over the independent variable.

What Is Experimental Design?

The term experimental design refers to a plan for conducting an experiment in such a way that research results will be valid and easy to interpret. This plan includes three interrelated activities:

  • Write statistical hypotheses.
  • Collect data.
  • Analyze data.

Let's look in a little more detail at these three activities.

Statistical Hypotheses

A statistical hypothesis is an assumption about the value of a population parameter. There are two types of statistical hypotheses:

Null hypothesis. The null hypothesis assumes that the population means in groups i and j are equal:

H0: μi = μj

Here, μi is the population mean for group i, and μj is the population mean for group j.

Alternative hypothesis. The alternative hypothesis assumes that the population means in groups i and j are not equal:

H1: μi ≠ μj

The null hypothesis and the alternative hypothesis are written to be mutually exclusive. If one is true, the other is not.

Experiments rely on sample data to test the null hypothesis. If experimental results, based on sample statistics, are consistent with the null hypothesis, the null hypothesis cannot be rejected; otherwise, the null hypothesis is rejected in favor of the alternative hypothesis.

Data Collection

The data collection phase of experimental design is all about methodology - how to run the experiment to produce valid, relevant statistics that can be used to test a null hypothesis.

Identify Variables

Every experiment exists to examine a cause-and-effect relationship. With respect to the relationship under investigation, an experimental design needs to account for three types of variables:

  • Dependent variable. The dependent variable is the outcome being measured, the effect in a cause-and-effect relationship.
  • Independent variables. An independent variable is a variable that is thought to be a possible cause in a cause-and-effect relationship.
  • Extraneous variables. An extraneous variable is any other variable that could affect the dependent variable, but is not explicitly included in the experiment.

Note: The independent variables that are explicitly included in an experiment are also called factors .

Define Treatment Groups

In an experiment, treatment groups are built around factors, each group defined by a unique combination of factor levels.

For example, suppose that a drug company wants to test a new cholesterol medication. The dependent variable is total cholesterol level. One independent variable is dosage. And, since some drugs affect men and women differently, the researchers include a second independent variable: gender.

This experiment has two factors - dosage and gender. The dosage factor has three levels (0 mg, 50 mg, and 100 mg), and the gender factor has two levels (male and female). Given this combination of factors and levels, we can define six unique treatment groups, as shown below:

Gender | 0 mg    | 50 mg   | 100 mg
Male   | Group 1 | Group 2 | Group 3
Female | Group 4 | Group 5 | Group 6

Note: The experiment described above is an example of a quasi-experiment, because the gender factor cannot be manipulated by the experimenter.
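Enumerating treatment groups is simply taking the Cartesian product of factor levels. Here is a minimal sketch of the 2 x 3 design above (the factors dictionary is illustrative scaffolding, not part of the original example):

```python
from itertools import product

# Factors and levels from the cholesterol example above
factors = {
    "gender": ["Male", "Female"],
    "dosage": ["0 mg", "50 mg", "100 mg"],
}

# Each treatment group is one unique combination of factor levels
combos = product(factors["gender"], factors["dosage"])
for i, (gender, dose) in enumerate(combos, start=1):
    print(f"Group {i}: {gender}, {dose}")
# 2 levels x 3 levels -> 6 treatment groups
```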

Select Factor Levels

A factor in an experiment can be described by the way in which factor levels are chosen for inclusion in the experiment:

  • Fixed factor. The experiment includes all factor levels about which inferences are to be made.
  • Random factor. The experiment includes a random sample of levels from a much bigger population of factor levels.

Experiments can be described by the presence or absence of fixed or random factors:

  • Fixed-effects model. All of the factors in the experiment are fixed.
  • Random-effects model. All of the factors in the experiment are random.
  • Mixed model. At least one factor in the experiment is fixed, and at least one factor is random.

The use of fixed factors versus random factors has implications for how experimental results are interpreted. With a fixed factor, results apply only to factor levels that are explicitly included in the experiment. With a random factor, results apply to every factor level from the population.

For example, consider the cholesterol experiment described above. Suppose the experimenter only wanted to test the effect of three particular dosage levels - 0 mg, 50 mg, and 100 mg. He would include those dosage levels in the experiment, and any research conclusions would apply to only those particular dosage levels. This would be an example of a fixed-effects model.

On the other hand, suppose the experimenter wanted to test the effect of any dosage level. Since it is not practical to test every dosage level, the experimenter might choose three dosage levels at random from the population of possible dosage levels. Any research conclusions would apply not only to the selected dosage levels, but also to other dosage levels that were not included explicitly in the experiment. This would be an example of a random-effects model.
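For readers who analyze such designs in software, here is a minimal sketch of a mixed model with a fixed dosage factor and a random grouping factor, fit with statsmodels. The clinic variable and all numbers are hypothetical, simulated for illustration; this is one of several ways to fit such models:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# 10 clinics (random factor), 9 observations each; dosage is the fixed factor
df = pd.DataFrame({
    "dose": np.tile([0, 50, 100], 30),
    "clinic": np.repeat(np.arange(10), 9),
})
clinic_effects = rng.normal(0, 4, 10)  # a random intercept per clinic
df["y"] = (230 - 0.3 * df["dose"]
           + clinic_effects[df["clinic"].to_numpy()]
           + rng.normal(0, 5, len(df)))

# Mixed model: fixed effect of dose, random intercept for each clinic
fit = smf.mixedlm("y ~ dose", df, groups=df["clinic"]).fit()
print(fit.params["dose"])  # should recover roughly -0.3
```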

Select Experimental Units

The experimental unit is the entity that provides values for the dependent variable. Depending on the needs of the study, an experimental unit may be a person, animal, plant, product - anything. For example, in the cholesterol study described above, researchers measured cholesterol level (the dependent variable) of people; so the experimental units were people.

Note: When the experimental units are people, they are often referred to as subjects . Some researchers prefer the term participant , because subject has a connotation that the person is subservient.

If time and money were no object, you would include the entire population of experimental units in your experiment. In the real world, where there is never enough time or money, you will usually select a sample of experimental units from the population.

Ultimately, you want to use sample data to make inferences about population parameters. With that in mind, it is best practice to draw a random sample of experimental units from the population. This provides a defensible, statistical basis for generalizing from sample findings to the larger population.

Finally, it is important to consider sample size. The larger the sample, the greater the statistical power, and the more confidence you can have in your results.

Assign Experimental Units to Treatments

Having selected a sample of experimental units, we need to assign each unit to one or more treatment groups. Here are two ways that you might assign experimental units to groups (a code sketch follows the list):

  • Independent groups design. Each experimental unit is randomly assigned to one, and only one, treatment group. This is also known as a between-subjects design .
  • Repeated measures design. Experimental units are assigned to more than one treatment group. This is also known as a within-subjects design .
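Here is a minimal sketch of random assignment for an independent groups design; the subject IDs and group labels are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded only for reproducibility

subjects = np.arange(24)              # 24 hypothetical subject IDs
rng.shuffle(subjects)                 # randomize the order
groups = np.array_split(subjects, 3)  # three equal-size treatment groups

for label, members in zip(["0 mg", "50 mg", "100 mg"], groups):
    print(label, sorted(members.tolist()))
```

In a repeated measures design, by contrast, every subject would appear in every treatment condition, typically in a counterbalanced order.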

Control for Extraneous Variables

Extraneous variables can mask effects of independent variables. Therefore, a good experimental design controls potential effects of extraneous variables. Here are a few strategies for controlling extraneous variables:

  • Randomization. Assign subjects randomly to treatment groups. This tends to distribute effects of extraneous variables evenly across groups.
  • Repeated measures design. To control for individual differences between subjects (age, attitude, religion, etc.), assign each subject to multiple treatments. This strategy is called using subjects as their own control.
  • Counterbalancing. In repeated measures designs, randomize or reverse the order of treatments among subjects to control for order effects (e.g., fatigue, practice).

As we describe specific experimental designs in upcoming lessons, we will point out the strategies that are used with each design to control the confounding effects of extraneous variables.

Data Analysis

Researchers follow a formal process to determine whether to reject a null hypothesis, based on sample data. This process, called hypothesis testing, consists of five steps (a worked example follows the list):

  • Formulate hypotheses. This involves stating the null and alternative hypotheses. Because the hypotheses are mutually exclusive, if one is true, the other must be false.
  • Choose the test statistic. This involves specifying the statistic that will be used to assess the validity of the null hypothesis. Typically, in analysis of variance studies, researchers compute an F ratio to test hypotheses.
  • Compute a P-value, based on sample data. Suppose the observed test statistic is equal to S . The P-value is the probability that the experiment would yield a test statistic as extreme as S , assuming the null hypothesis is true.
  • Choose a significance level. The significance level, denoted by α, is the probability of rejecting the null hypothesis when it is really true. Researchers often choose a significance level of 0.05 or 0.01.
  • Test the null hypothesis. If the P-value is smaller than the significance level, we reject the null hypothesis; if it is larger, we fail to reject.
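The five steps map directly onto a few lines of code. Below is a minimal sketch using SciPy's one-way ANOVA, with hypothetical cholesterol data for three dosage groups:

```python
from scipy import stats

# Hypothetical cholesterol levels for three dosage groups
placebo = [215, 222, 231, 208, 219]
low_dose = [201, 210, 198, 205, 212]
high_dose = [188, 195, 183, 190, 199]

# One-way ANOVA: H0 says all group means are equal
f_stat, p_value = stats.f_oneway(placebo, low_dose, high_dose)

alpha = 0.05  # significance level chosen before looking at the data
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: at least one group mean differs.")
else:
    print("Fail to reject H0.")
```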

A good experimental design includes a precise plan for data analysis. Before the first data point is collected, a researcher should know how experimental data will be processed to accept or reject the null hypotheses.

Test Your Understanding

In a well-designed experiment, which of the following statements is true?

I. The null hypothesis and the alternative hypothesis are mutually exclusive.
II. The null hypothesis is subjected to statistical test.
III. The alternative hypothesis is subjected to statistical test.

(A) I only
(B) II only
(C) III only
(D) I and II
(E) I and III

The correct answer is (D). The null hypothesis and the alternative hypothesis are mutually exclusive; if one is true, the other must be false. Only the null hypothesis is subjected to statistical test; the alternative hypothesis is accepted or rejected indirectly, based on the outcome of that test.

In a true experiment, each subject is assigned to only one treatment group. What type of design is this?

(A) Independent groups design
(B) Repeated measures design
(C) Within-subjects design
(D) None of the above
(E) All of the above

The correct answer is (A). In an independent groups design, each experimental unit is assigned to one, and only one, treatment group. In a repeated measures (within-subjects) design, each experimental unit is assigned to more than one treatment group.

In a true experiment, which of the following does the experimenter control?

(A) How to manipulate independent variables.
(B) How to assign subjects to treatment conditions.
(C) How to control for extraneous variables.
(D) None of the above
(E) All of the above

The correct answer is (E). The experimenter chooses factors and factor levels for the experiment, assigns experimental units to treatment groups (often through a random process), and implements strategies (randomization, counterbalancing, etc.) to control the influence of extraneous variables.


OU Libraries

Quantitative Research Methods


Statistical Analysis References

  • Online Statistics Education: An Interactive Multimedia Course of Study This open and free introductory statistics textbook covers topics typical for a college-level non-math majors statistics course. Topics include distributions, probability, research design, estimation, hypothesis testing, power and effect size, comparison of means, regression, analysis of variance (ANOVA), transformations, chi square, and non-parametric (distribution-free) tests. It is available as a pdf, online, or as an epub. An Instructor's Manual and PowerPoint slides are also available upon request from the project leader at Rice University.
  • Introductory Statistics A free and open introductory statistics textbook for non-math majors. "They have sought to present only the core concepts and use a wide-ranging set of exercises for each concept to drive comprehension. [...] a smaller and less intimidating textbook that trades some extended and unnecessary topics for a better-focused presentation of the central material." It covers descriptive statistics, probability, distributions, discrete and continuous random variables, estimation, hypothesis testing, comparison of means, correlation and regression, chi square, and F-tests.
  • Introductory Statistics with Randomization and Simulation "We hope readers will take away three ideas from this book in addition to forming a foundation of statistical thinking and methods. (1) Statistics is an applied field with a wide range of practical applications. (2) You don't have to be a math guru to learn from interesting, real data. (3) Data are messy, and statistical tools are imperfect. However, when you understand the strengths and weaknesses of these tools, you can use them to learn interesting things about the world." This free and open introductory statistics textbook for non-math majors discusses data and data collection, foundations for inference with randomization and simulations (then leading into standard parametric statistics), inference with categorical and numerical data, and linear, multiple logistic regression. An introduction to probability is included as an appendix.


  • Statistics LibreTexts Bookshelf Curates multiple openly available statistics textbooks.


  • Encyclopedia of Statistical Sciences "Reference tool covering statistics, probability theory, biostatistics, quality control, and economics with emphasis in applications of statistical methods in sociology, engineering, computer science, biomedicine, psychology, survey methodology, and other client disciplines." A good source for topics less often covered in the general textbooks.
  • The Concise Encyclopedia of Statistics "More than 500 entries include definitions, history, mathematical details, limitations, examples, references, and further readings. All entries include cross-references as well as the key citations. The back matter includes a timeline of statistical inventions." Another good resource for topics not included in the general texts listed previously.

Meta-analysis

  • Meta-Analysis: quantitative methods for research synthesis by Fredric Marc Wolf Call Number: Online access ISBN: 0585216975 Publication Date: 1986 "Meta-Analysis shows how to apply statistical methods to achieve a literature review of a common research domain. It demonstrates the use of combined tests and measures of effect size to synthesise quantitatively the results of independent studies for both group differences and correlations."
  • Meta-Analysis: cumulating research findings across studies by John E. Hunter, Frank L. Schmidt, and Gregg B. Jackson Call Number: Online access ISBN: 0803918631 Publication Date: 1982 "Meta-analysis is a way of synthesizing previous research on a subject in order to assess what has already been learned, and even to derive new conclusions from the mass of already researched data. "


  • Integrating Omics Data by George Tseng; Debashis Ghosh; Xianghong Jasmine Zhou Call Number: Online access ISBN: 9781107706484 Publication Date: 2015 "As the technologies have become mature and the price affordable, omics data are rapidly generated, and the problem of information integration and modeling of multi-lab and/or multi-omics data is becoming a growing one in the bioinformatics field. This book provides comprehensive coverage of these topics and will have a long-lasting impact on this evolving subject. Each chapter, written by a leader in the field, introduces state-of-the-art methods to handle information integration, experimental data, and database problems of omics data."


  • Meta-analysis in social research by Gene V. Glass, Barry McGaw, and Mary Lee Smith Call Number: Online access Publication Date: 1981 Covers the problems of research review and integration; meta-analysis of research; finding studies; describing, classifying and coding research studies; measuring study findings; techniques of analysis; and an evaluation of meta-analysis.


Ordination and Principal Components Analysis (PCA)

  • The Ordination Web Page "designed to address some of the most frequently asked questions about ordination. It is my intention to gear this page towards the student and the practitioner rather than the ordination specialist"
  • Nature Methods: Principal Components Analysis "Principal component analysis (PCA) simplifies the complexity in high-dimensional data while retaining trends and patterns. It does this by transforming the data into fewer dimensions, which act as summaries of features. "
  • A One-Stop Shop for Principal Components Analysis "The rationale for this method, the math under the hood, some best practices, and potential drawbacks to the method."


  • NIST/SEMATECH e-Handbook of Statistical Methods: Principal Components "A Multivariate Analysis problem could start out with a substantial number of correlated variables. Principal Component Analysis is a dimension-reduction tool that can be used advantageously in such situations. Principal component analysis aims at reducing a large set of variables to a small set that still contains most of the information in the large set. "


Qualitative Research Methods


  • Qualitative Research Methods: A Data Collector's Field Guide "This how-to guide covers the mechanics of data collection for applied qualitative research. It is appropriate for novice and experienced researchers alike. It can be used as both a training tool and a daily reference manual for field team members. The question and answer format and modular design make it easy for readers to find information on a particular topic quickly."




Experimental Design and Data Analysis (MAST10011)

Undergraduate level 1 Points: 12.5 On Campus (Parkville)


This subject provides an understanding of the fundamental concepts of probability and statistics required for experimental design and data analysis in the health sciences. Initially the subject introduces common study designs, random sampling and randomised trials as well as numerical and visual methods of summarising data. It then focuses on understanding population characteristics such as means, variances, proportions, risk ratios, odds ratios, rates, prevalence, and measures used to assess the diagnostic value of a clinical test. Finally, after determining the sampling distributions of some common statistics, confidence intervals will be used to estimate these population characteristics and statistical tests of hypotheses will be developed. The presentation and interpretation of the results from statistical analyses of typical health research studies will be emphasised.

The statistical methods will be implemented using a standard statistical computing package and illustrated on applications from the health sciences.

Intended learning outcomes

On completion of the subject, students should be able to:

  • analyse standard data sets, interpreting the results of such analysis and presenting the conclusions in a clear and comprehensible manner;
  • understand a range of standard statistical methods which can be applied to biomedical sciences;
  • use a statistical computing package to analyse biomedical data;
  • choose a form of epidemiological experimental design suitable for a range of standard biomedical experiments.

Generic skills

In addition to learning specific skills that will assist students in their future careers in the health sciences, they will have the opportunity to develop generic skills that will assist them in any future career path. These include:

  • problem-solving skills: the ability to engage with unfamiliar problems and identify relevant solution strategies;
  • analytical skills: the ability to construct and express logical arguments and to work in abstract or general terms to increase the clarity and efficiency of analysis;
  • collaborative skills: the ability to work in a team;
  • time management skills: the ability to meet regular deadlines while balancing competing commitments;
  • computer skills: the ability to use statistical computing packages.

Last updated: 31 October 2023


Quasi-Experimental Research Design – Types, Methods

Quasi-Experimental Design

Quasi-experimental design is a research method that seeks to evaluate the causal relationships between variables, but without the full control over the independent variable(s) that is available in a true experimental design.

In a quasi-experimental design, the researcher uses an existing group of participants that is not randomly assigned to the experimental and control groups. Instead, the groups are selected based on pre-existing characteristics or conditions, such as age, gender, or the presence of a certain medical condition.

Types of Quasi-Experimental Design

There are several types of quasi-experimental designs that researchers use to study causal relationships between variables. Here are some of the most common types:

Non-Equivalent Control Group Design

This design involves selecting two groups of participants that are similar in every way except for the independent variable(s) that the researcher is testing. One group receives the treatment or intervention being studied, while the other group does not. The two groups are then compared to see if there are any significant differences in the outcomes.

Interrupted Time-Series Design

This design involves collecting data on the dependent variable(s) over a period of time, both before and after an intervention or event. The researcher can then determine whether there was a significant change in the dependent variable(s) following the intervention or event.

Pretest-Posttest Design

This design involves measuring the dependent variable(s) before and after an intervention or event, but without a control group. This design can be useful for determining whether the intervention or event had an effect, but it does not allow for control over other factors that may have influenced the outcomes.

Regression Discontinuity Design

This design involves selecting participants based on a specific cutoff point on a continuous variable, such as a test score. Participants on either side of the cutoff point are then compared to determine whether the intervention or event had an effect.

Natural Experiments

This design involves studying the effects of an intervention or event that occurs naturally, without the researcher’s intervention. For example, a researcher might study the effects of a new law or policy that affects certain groups of people. This design is useful when true experiments are not feasible or ethical.

Data Analysis Methods

Here are some data analysis methods that are commonly used in quasi-experimental designs:

Descriptive Statistics

This method involves summarizing the data collected during a study using measures such as mean, median, mode, range, and standard deviation. Descriptive statistics can help researchers identify trends or patterns in the data, and can also be useful for identifying outliers or anomalies.
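A minimal sketch of computing these summary measures with pandas (the scores are hypothetical):

```python
import pandas as pd

scores = pd.Series([72, 85, 85, 90, 64, 78, 88, 95, 70, 82])

summary = {
    "mean": scores.mean(),
    "median": scores.median(),
    "mode": scores.mode().tolist(),  # mode() can return several values
    "range": scores.max() - scores.min(),
    "std dev": scores.std(),         # sample standard deviation (ddof=1)
}
print(summary)
```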

Inferential Statistics

This method involves using statistical tests to determine whether the results of a study are statistically significant. Inferential statistics can help researchers make generalizations about a population based on the sample data collected during the study. Common statistical tests used in quasi-experimental designs include t-tests, ANOVA, and regression analysis.
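For example, a two-sample t-test comparing a non-randomized intervention group with a control group might look like the following sketch. The scores are hypothetical; Welch's variant is used because quasi-experimental groups often have unequal variances:

```python
from scipy import stats

intervention = [78, 85, 82, 90, 88, 76, 84]
control = [72, 75, 80, 70, 74, 78, 71]

# Welch's t-test (equal_var=False) does not assume equal group variances
t_stat, p_value = stats.ttest_ind(intervention, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```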

Propensity Score Matching

This method is used to reduce bias in quasi-experimental designs by matching participants in the intervention group with participants in the control group who have similar characteristics. This can help to reduce the impact of confounding variables that may affect the study’s results.
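Below is a minimal sketch of the two core steps: estimating propensity scores with logistic regression, then nearest-neighbor matching. The data are simulated, and the matching shown is greedy and with replacement, which real studies often refine:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                         # covariates, e.g. age, income
t = (X[:, 0] + rng.normal(size=200) > 0).astype(int)  # non-random treatment

# Step 1: estimate each unit's propensity score P(treated | covariates)
scores = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

# Step 2: match each treated unit to the control with the closest score
treated_idx = np.where(t == 1)[0]
control_idx = np.where(t == 0)[0]
matches = {
    i: control_idx[np.argmin(np.abs(scores[control_idx] - scores[i]))]
    for i in treated_idx
}
print(f"Matched {len(matches)} treated units to controls.")
```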

Difference-in-differences Analysis

This method is used to compare the difference in outcomes between two groups over time. Researchers can use this method to determine whether a particular intervention has had an impact on the target population over time.
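In regression form, the difference-in-differences estimate is the coefficient on the interaction between a treatment-group flag and a post-period flag. A minimal sketch with hypothetical panel data:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical outcomes: two groups (treated/control), two periods (pre/post)
df = pd.DataFrame({
    "y":       [10, 11, 12, 13, 10, 11, 15, 16],
    "treated": [0, 0, 1, 1, 0, 0, 1, 1],
    "post":    [0, 0, 0, 0, 1, 1, 1, 1],
})

# The treated:post coefficient is the difference-in-differences estimate
model = smf.ols("y ~ treated * post", data=df).fit()
print(model.params["treated:post"])  # 3.0 for this toy data
```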

Interrupted Time Series Analysis

This method is used to examine the impact of an intervention or treatment over time by comparing data collected before and after the intervention or treatment. This method can help researchers determine whether an intervention had a significant impact on the target population.
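Interrupted time series are commonly analyzed with segmented regression: a baseline trend, a level change at the intervention, and a slope change afterwards. A minimal sketch on simulated monthly data (all values hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

n = 24  # 24 months; the intervention begins at month 12
df = pd.DataFrame({"t": np.arange(n)})
df["post"] = (df["t"] >= 12).astype(int)     # 1 after the intervention
df["t_since"] = np.maximum(0, df["t"] - 12)  # months since the intervention
df["y"] = 50 + 0.5 * df["t"] + 5 * df["post"] \
          + np.random.default_rng(1).normal(0, 1, n)

# Segmented regression: baseline trend (t), level change (post), slope change (t_since)
fit = smf.ols("y ~ t + post + t_since", data=df).fit()
print(fit.params[["post", "t_since"]])
```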

Regression Discontinuity Analysis

This method is used to compare the outcomes of participants who fall on either side of a predetermined cutoff point. This method can help researchers determine whether an intervention had a significant impact on the target population.
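A minimal sketch of a sharp regression discontinuity analysis: center the running variable at the cutoff, keep only observations within a bandwidth, and fit a local linear model whose treatment coefficient estimates the jump at the cutoff. The data, cutoff, and bandwidth are all hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
score = rng.uniform(0, 100, 500)     # running variable, e.g. a test score
treated = (score >= 60).astype(int)  # sharp cutoff at 60
y = 20 + 0.3 * score + 8 * treated + rng.normal(0, 3, 500)

df = pd.DataFrame({"y": y, "x": score - 60, "treated": treated})
local = df[df["x"].abs() <= 15]      # keep units within the bandwidth

# Local linear regression with separate slopes on each side of the cutoff
fit = smf.ols("y ~ x * treated", data=local).fit()
print(fit.params["treated"])         # estimated jump, close to 8
```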

Steps in Quasi-Experimental Design

Here are the general steps involved in conducting a quasi-experimental design:

  • Identify the research question: Determine the research question and the variables that will be investigated.
  • Choose the design: Choose the appropriate quasi-experimental design to address the research question. Examples include the pretest-posttest design, non-equivalent control group design, regression discontinuity design, and interrupted time series design.
  • Select the participants: Select the participants who will be included in the study. Participants should be selected based on specific criteria relevant to the research question.
  • Measure the variables: Measure the variables that are relevant to the research question. This may involve using surveys, questionnaires, tests, or other measures.
  • Implement the intervention or treatment: Implement the intervention or treatment to the participants in the intervention group. This may involve training, education, counseling, or other interventions.
  • Collect data: Collect data on the dependent variable(s) before and after the intervention. Data collection may also include collecting data on other variables that may impact the dependent variable(s).
  • Analyze the data: Analyze the data collected to determine whether the intervention had a significant impact on the dependent variable(s).
  • Draw conclusions: Draw conclusions about the relationship between the independent and dependent variables. If the results suggest a causal relationship, then appropriate recommendations may be made based on the findings.

Quasi-Experimental Design Examples

Here are some examples of real-time quasi-experimental designs:

  • Evaluating the impact of a new teaching method: In this study, a group of students are taught using a new teaching method, while another group is taught using the traditional method. The test scores of both groups are compared before and after the intervention to determine whether the new teaching method had a significant impact on student performance.
  • Assessing the effectiveness of a public health campaign: In this study, a public health campaign is launched to promote healthy eating habits among a targeted population. The behavior of the population is compared before and after the campaign to determine whether the intervention had a significant impact on the target behavior.
  • Examining the impact of a new medication: In this study, a group of patients is given a new medication, while another group is given a placebo. The outcomes of both groups are compared to determine whether the new medication had a significant impact on the targeted health condition.
  • Evaluating the effectiveness of a job training program: In this study, a group of unemployed individuals is enrolled in a job training program, while another group is not enrolled in any program. The employment rates of both groups are compared before and after the intervention to determine whether the training program had a significant impact on the employment rates of the participants.
  • Assessing the impact of a new policy: In this study, a new policy is implemented in a particular area, while another area does not have the new policy. The outcomes of both areas are compared before and after the intervention to determine whether the new policy had a significant impact on the targeted behavior or outcome.

Applications of Quasi-Experimental Design

Here are some applications of quasi-experimental design:

  • Educational research: Quasi-experimental designs are used to evaluate the effectiveness of educational interventions, such as new teaching methods, technology-based learning, or educational policies.
  • Health research: Quasi-experimental designs are used to evaluate the effectiveness of health interventions, such as new medications, public health campaigns, or health policies.
  • Social science research: Quasi-experimental designs are used to investigate the impact of social interventions, such as job training programs, welfare policies, or criminal justice programs.
  • Business research: Quasi-experimental designs are used to evaluate the impact of business interventions, such as marketing campaigns, new products, or pricing strategies.
  • Environmental research: Quasi-experimental designs are used to evaluate the impact of environmental interventions, such as conservation programs, pollution control policies, or renewable energy initiatives.

When to use Quasi-Experimental Design

Here are some situations where quasi-experimental designs may be appropriate:

  • When the research question involves investigating the effectiveness of an intervention, policy, or program: In situations where it is not feasible or ethical to randomly assign participants to intervention and control groups, quasi-experimental designs can be used to evaluate the impact of the intervention on the targeted outcome.
  • When the sample size is small: In situations where the sample size is small, it may be difficult to randomly assign participants to intervention and control groups. Quasi-experimental designs can be used to investigate the impact of an intervention without requiring a large sample size.
  • When the research question involves investigating a naturally occurring event: In some situations, researchers may be interested in investigating the impact of a naturally occurring event, such as a natural disaster or a major policy change. Quasi-experimental designs can be used to evaluate the impact of the event on the targeted outcome.
  • When the research question involves investigating a long-term intervention: In situations where the intervention or program is long-term, it may be difficult to randomly assign participants to intervention and control groups for the entire duration of the intervention. Quasi-experimental designs can be used to evaluate the impact of the intervention over time.
  • When the research question involves investigating the impact of a variable that cannot be manipulated: In some situations, it may not be possible or ethical to manipulate a variable of interest. Quasi-experimental designs can be used to investigate the relationship between the variable and the targeted outcome.

Purpose of Quasi-Experimental Design

The purpose of quasi-experimental design is to investigate the causal relationship between two or more variables when it is not feasible or ethical to conduct a randomized controlled trial (RCT). Quasi-experimental designs attempt to emulate the randomized control trial by mimicking the control group and the intervention group as much as possible.

The key purpose of quasi-experimental design is to evaluate the impact of an intervention, policy, or program on a targeted outcome while controlling for potential confounding factors that may affect the outcome. Quasi-experimental designs aim to answer questions such as: Did the intervention cause the change in the outcome? Would the outcome have changed without the intervention? And was the intervention effective in achieving its intended goals?

Quasi-experimental designs are useful in situations where randomized controlled trials are not feasible or ethical. They provide researchers with an alternative method to evaluate the effectiveness of interventions, policies, and programs in real-life settings. Quasi-experimental designs can also help inform policy and practice by providing valuable insights into the causal relationships between variables.

Overall, the purpose of quasi-experimental design is to provide a rigorous method for evaluating the impact of interventions, policies, and programs while controlling for potential confounding factors that may affect the outcome.

Advantages of Quasi-Experimental Design

Quasi-experimental designs have several advantages over other research designs, such as:

  • Greater external validity : Quasi-experimental designs are more likely to have greater external validity than laboratory experiments because they are conducted in naturalistic settings. This means that the results are more likely to generalize to real-world situations.
  • Ethical considerations: Quasi-experimental designs often involve naturally occurring events, such as natural disasters or policy changes. This means that researchers do not need to manipulate variables, which can raise ethical concerns.
  • More practical: Quasi-experimental designs are often more practical than experimental designs because they are less expensive and easier to conduct. They can also be used to evaluate programs or policies that have already been implemented, which can save time and resources.
  • No random assignment: Quasi-experimental designs do not require random assignment, which can be difficult or impossible in some cases, such as when studying the effects of a natural disaster. This means that researchers can still make causal inferences, although they must use statistical techniques to control for potential confounding variables.
  • Greater generalizability : Quasi-experimental designs are often more generalizable than experimental designs because they include a wider range of participants and conditions. This can make the results more applicable to different populations and settings.

Limitations of Quasi-Experimental Design

There are several limitations associated with quasi-experimental designs, which include:

  • Lack of Randomization: Quasi-experimental designs do not involve randomization of participants into groups, which means that the groups being studied may differ in important ways that could affect the outcome of the study. This can lead to problems with internal validity and limit the ability to make causal inferences.
  • Selection Bias: Quasi-experimental designs may suffer from selection bias because participants are not randomly assigned to groups. Participants may self-select into groups or be assigned based on pre-existing characteristics, which may introduce bias into the study.
  • History and Maturation: Quasi-experimental designs are susceptible to history and maturation effects, where the passage of time or other events may influence the outcome of the study.
  • Lack of Control: Quasi-experimental designs may lack control over extraneous variables that could influence the outcome of the study. This can limit the ability to draw causal inferences from the study.
  • Limited Generalizability: Quasi-experimental designs may have limited generalizability because the results may only apply to the specific population and context being studied.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


Experimental Research: Meaning And Examples Of Experimental Research


Ever wondered why scientists across the world are being lauded for discovering the Covid-19 vaccine so early? It’s because every government knows that vaccines are a result of experimental research design and it takes years of collected data to make one. It takes a lot of time to compare formulas and combinations with an array of possibilities across different age groups, genders and physical conditions. With their efficiency and meticulousness, scientists redefined the meaning of experimental research when they discovered a vaccine in less than a year.

What Is Experimental Research?


Experimental research is a scientific method of conducting research using two variables: independent and dependent. Independent variables can be manipulated to apply to dependent variables and the effect is measured. This measurement usually happens over a significant period of time to establish conditions and conclusions about the relationship between these two variables.

Experimental research is widely implemented in education, psychology, social sciences and physical sciences. Experimental research is based on observation, calculation, comparison and logic. Researchers collect quantitative data and perform statistical analyses of two sets of variables. This method collects necessary data to focus on facts and support sound decisions. It’s a helpful approach when time is a factor in establishing cause-and-effect relationships or when an invariable behavior is seen between the two.  

Now that we know the meaning of experimental research, let’s look at its characteristics, types and advantages.

The hypothesis is at the core of an experimental research design. Researchers propose a tentative answer after defining the problem and then test the hypothesis to either confirm or disregard it. Here are a few characteristics of experimental research:

  • Independent variables are manipulated and applied to dependent variables as an experimental treatment, and the effect on the dependent variables is measured. Extraneous variables are variables generated from other factors that can affect the experiment and contribute to change. Researchers have to exercise control to reduce the influence of these variables by randomization, making homogeneous groups and applying statistical analysis techniques.
  • Researchers deliberately operate independent variables on the subject of the experiment. This is known as manipulation.
  • Once a variable is manipulated, researchers observe the effect an independent variable has on a dependent variable. This is key for interpreting results.
  • A researcher may want multiple comparisons between different groups with equivalent subjects. They may replicate the process by conducting sub-experiments within the framework of the experimental design.

Experimental research is equally effective in non-laboratory settings as it is in labs. It helps in predicting events in an experimental setting. It generalizes variable relationships so that they can be implemented outside the experiment and applied to a wider interest group.

The way a researcher assigns subjects to different groups determines the types of experimental research design .

Pre-experimental Research Design

In a pre-experimental research design, researchers observe a group or various groups to see the effect an independent variable has on the dependent variable. There is no control group, as it is a simple form of experimental research. It’s further divided into three categories:

  • A one-shot case study research design is a study where one dependent variable is considered. It’s a posttest study as it’s carried out after treating what presumably caused the change.
  • One-group pretest-posttest design is a study that combines both pretest and posttest studies by testing a single group before and after administering the treatment.
  • Static-group comparison involves studying two groups by subjecting one to treatment while the other remains static. After post-testing all groups the differences are observed.

This design is practical but lacks in certain areas of true experimental criteria.

True Experimental Research Design

This design depends on statistical analysis to approve or disregard a hypothesis. It’s an accurate design that can be conducted with or without a pretest on a minimum of two randomly assigned groups of subjects. It is further classified into three types:

  • The posttest-only control group design involves randomly selecting and assigning subjects to two groups: experimental and control. Only the experimental group is treated, while both groups are observed and post-tested to draw a conclusion from the difference between the groups.
  • In a pretest-posttest control group design, two groups are randomly assigned subjects. Both groups are presented, the experimental group is treated and both groups are post-tested to measure how much change happened in each group.
  • Solomon four-group design is a combination of the previous two methods. Subjects are randomly selected and assigned to four groups. Two groups are tested using each of the previous methods.

True experimental research design should have a variable to manipulate, a control group and random distribution.
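As a small illustration of how pretest-posttest data from such designs are often analyzed, here is a sketch of a paired t-test on hypothetical scores for one group; a full pretest-posttest control group analysis would also compare this change against the control group's change:

```python
from scipy import stats

# Hypothetical scores for one group, before and after treatment
pretest = [62, 70, 58, 75, 66, 72, 60]
posttest = [70, 78, 63, 80, 72, 79, 68]

# Paired t-test on the within-subject change
t_stat, p_value = stats.ttest_rel(posttest, pretest)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```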

With experimental research, we can test ideas in a controlled environment before marketing. It acts as the best method to test a theory as it can help in making predictions about a subject and drawing conclusions. Let’s look at some of the advantages that make experimental research useful:

  • It allows researchers to maintain a strong hold over variables and collect the desired results.
  • Results are usually specific.
  • The effectiveness of the research isn’t affected by the subject.
  • Findings from the results usually apply to similar situations and ideas.
  • Cause and effect of a hypothesis can be identified, which can be further analyzed for in-depth ideas.
  • It’s the ideal starting point to collect data and lay a foundation for conducting further research and building more ideas.
  • Medical researchers can develop medicines and vaccines to treat diseases by collecting samples from patients and testing them under multiple conditions.
  • It can be used to improve the standard of academics across institutions by testing student knowledge and teaching methods before analyzing the result to implement programs.
  • Social scientists often use experimental research design to study and test behavior in humans and animals.
  • Software development and testing heavily depend on experimental research to test programs by letting subjects use a beta version and analyzing their feedback.

Even though it’s a scientific method, it has a few drawbacks. Here are a few disadvantages of this research method:

  • Human error is a concern because the method depends on controlling variables. Improper implementation nullifies the validity of the research and conclusion.
  • Eliminating extraneous variables strips away real-life conditions, which can make conclusions less accurate when generalized beyond the experiment.
  • The process is time-consuming and expensive.
  • In medical research, it can have ethical implications by affecting patients’ well-being.
  • Results are not descriptive and subjects can contribute to response bias.

Experimental research design is a sophisticated method that investigates relationships or occurrences among people or phenomena under a controlled environment and identifies the conditions responsible for such relationships or occurrences.

Experimental research can be used in any industry to anticipate responses, changes, causes and effects. Here are some examples of experimental research :

  • This research method can be used to evaluate employees’ skills. Organizations ask candidates to take tests before filling a post. It is used to screen qualified candidates from a pool of applicants. This allows organizations to identify skills at the time of employment. After training employees on the job, organizations further evaluate them to test impact and improvement. This is a pretest-posttest control group research example where employees are ‘subjects’ and the training is ‘treatment’.
  • Educational institutions follow the pre-experimental research design to administer exams and evaluate students at the end of a semester. Students are the dependent variables and lectures are independent. Since exams are conducted at the end and not the beginning of a semester, it’s easy to conclude that it’s a one-shot case study research.
  • To evaluate the teaching methods of two teachers, they can be assigned two student groups. After teaching their respective groups on the same topic, a posttest can determine which group scored better and who is better at teaching. This method can have its drawbacks as certain human factors, such as attitudes of students and effectiveness to grasp a subject, may negatively influence results. 

Experimental research is considered a standard method that uses observations, simulations and surveys to collect data. One of its unique features is the ability to control extraneous variables and their effects. It’s a suitable method for those looking to examine the relationship between cause and effect in a field setting or in a laboratory. Although experimental research design is a scientific approach, research is not entirely a scientific process. As much as managers need to know what is experimental research , they have to apply the correct research method, depending on the aim of the study.

Harappa’s Thinking Critically program makes you more decisive and lets you think like a leader. It’s a growth-driven course for managers who want to devise and implement sound strategies, freshers looking to build a career and entrepreneurs who want to grow their business. Identify and avoid arguments, communicate decisions and rely on effective decision-making processes in uncertain times. This course teaches critical and clear thinking. It’s packed with problem-solving tools, highly impactful concepts and relatable content. Build an analytical mindset, develop your skills and reap the benefits of critical thinking with Harappa!

Explore Harappa Diaries to learn more about topics such as Main Objective Of Research , Definition Of Qualitative Research , Examples Of Experiential Learning and Collaborative Learning Strategies to upgrade your knowledge and skills.



Long-Term Selection Cutting Study

Since 1952, scientists have been collecting data from the Cutting Methods Study on the Argonne Experimental Forest in Wisconsin. The study compares how different forest management strategies affect the growth of desirable trees. The original goal was to figure out how to manage trees to get the greatest number of large, healthy trees for timber and other wood products.

Today, NRS research forester Christel Kern and collaborators are revisiting the original study to look at how trees have grown under different management practices over the last 70+ years.

This research evaluates various tree cutting methods, plus a control area with no cutting, on long-term tree growth across three 40-acre blocks of hardwood trees. The research team is analyzing how the cutting methods influence different patterns of tree growth, death, and renewal in order to meet specific goals.


Height plot used to remeasure tree quality variables (2017). 

Early Findings 

In the 1960s, early analyses from the Cutting Methods Study found that medium selection produced the best balance of tree growth, timber yield, regrowth of new trees, and tree death. As a result, for more than 50 years, regional forest plans and forest management guides have often recommended variations of medium selection cutting. Medium selection means removing a moderate amount of trees so that the remaining stand is stocked at about 75 square feet per acre of tree basal area.

Alternative selection cutting approaches are light selection and heavy selection, which leave residual growing stocks (basal area) of about 90 and 60 square feet per acre, respectively. Early data analysis from the study found that light selection led to high-quality trees remaining in the forest but poor regeneration of new trees. Heavy selection led to greater growth and high economic returns but lower-quality trees remaining in the forest. Finally, medium selection produced the highest economic return in terms of marketable timber.

Analysis Today

Today, data collection from the Cutting Methods Study and analysis of the historic dataset continues through collaborations between the research team from the Forest Service and universities.

Recent analysis has reaffirmed some study findings from the 1950s while uncovering new insights. Single-tree selection methods, particularly those that involve heavier cutting, are more financially rewarding and beneficial for forest growth. However, it is essential to strike a balance between cutting too much and leaving too many trees standing. This analysis also highlights the importance of long-term monitoring in forest management. While heavy selection can accelerate desired outcomes, it requires careful observation over many years to ensure sustainability.


Forester T. Strong assists with tree size measurements in permanent plots on the Argonne Experimental Forest (2016). 

The original goal of selection cutting was to establish a balanced arrangement in forests with trees of various ages and sizes: many small trees, some medium-sized ones, and a few high-quality large sawlog trees. The idea was to harvest trees from all size classes gradually, allowing the stand to regenerate and maintain its original structure. However, a study published in 2019 found that this method hasn’t yet created a balanced uneven-aged stand. Instead of maintaining a varied age structure (uneven-aged), some forests continue to show characteristics of even-aged stands. This suggests that selection cuttings might take a long time to stabilize forests that are initially even-aged. At the same time, in forests that were already uneven-aged, the selection cutting method seemed to work well and kept the structure stable.

The Cutting Methods study’s focus on conventional forest management techniques means that it may not address all aspects of sustainable forest management, such as indigenous values, wildlife habitat, and carbon sequestration. Further research is needed to explore these aspects fully.

Despite its limitations, this long-term study is providing valuable insights into forest management practices and offering guidance to forest managers striving to balance economic and ecological goals. It also underscores the importance of continued investment in long-term research to ensure the health and sustainability of our forests for future generations.


The Argonne Experimental Forest in northern Wisconsin was established in 1947. Most studies on the forest focus on how to bring second-growth northern hardwoods under management. 

At its inception, the forest was primarily composed of second-growth, even-aged trees, around 45 years old, with some older, low-quality trees scattered throughout. Within the study stands, different silvicultural treatments were randomly assigned to 1-hectare units. The treatments included variations of single-tree selection, diameter-limit cutting, crop tree release, and a control.

To assess the impact of these treatments, researchers established permanent sample plots in 1951 and recorded tree diameter, species, and other relevant data at regular intervals. They also evaluated factors like growth, mortality, and yield using sophisticated statistical analyses, including ANOVA and diameter-increment models.

Additionally, a financial analysis was conducted to measure the economic success of each treatment, focusing on harvested lumber volumes and values. This analysis used managed forest value (MFV) to evaluate the long-term sustainability of each silvicultural system.

The selection cuttings in this study have occurred every 10 years and have removed trees with poor form, overtopped crowns, and defects.

Cutting Methods Study Details

  • The study has used three levels of tree cutting intensity: light selection (harvesting trees until about 90 square feet per acre of basal area remain), medium selection (75 square feet per acre of post-harvest basal area), and heavy selection (60 square feet per acre of post-harvest basal area).
  • Diameter-limit cutting involves removing all trees 7 inches or more in diameter. The diameter limit aimed simply to extract merchantable timber above a specified size threshold, without consideration for the remaining forest or regeneration of the future forest.
  • Crop tree release (also known as thinning), involves removing selected trees in younger, crowded forests to create a faster-growing, healthier forest with a greater number of desirable trees for timber. The crop tree release was a thinning tool to tend and improve the entire stand as one age class until maturity.
  • Additional treatments have been applied but were not re-evaluated in the most recent analysis, including clearcutting, group selection, and strip cutting.

Christel C. Kern

Collaborators

Robert Froese and Maeve Draper,  Michigan Technological University

Ralph Nyland and Sarita Bassil, SUNY ESF

Laura Kenefic, Northern Research Station

Publications

  • Maeve C. Draper, Christel C. Kern, Robert E. Froese. 2021. Growth, yield, and financial return through six decades of various management approaches in a second-growth northern hardwood forest
  • Christel C. Kern, Laura S. Kenefic, Christian Kuehne, Aaron R. Weiskittel, Sarah J. Kaschmitter, Anthony W. D'Amato, Daniel C. Dey, John M. Kabrick, Brian J. Palik, Thomas M. Schuler. 2021. Relative influence of stand and site factors on aboveground live-tree carbon sequestration and mortality in managed and unmanaged forests
  • Sarita Bassil, Ralph D. Nyland, Christel C. Kern, Laura S. Kenefic. 2019. Dynamics of the diameter distribution after selection cutting in uneven- and even-aged northern hardwood stands: a long-term evaluation
  • Kelly E. Gleason, John B. Bradford, Alessandra Bottero, Anthony W. D'Amato, Shawn Fraver, Brian J. Palik, Michael A. Battaglia, Louis Iverson, Laura Kenefic, Christel C. Kern. 2017. Competition amplifies drought stress in forests across broad climatic and compositional gradients
  • M.T. Curzon, A.W. D'Amato, S. Fraver, B.J. Palik, A. Bottero, J.R. Foster, K.E. Gleason. 2017. Harvesting influences functional identity and diversity over time in forests of the northeastern U.S.A.
  • Christel Kern, Gus Erdmann, Laura Kenefic, Brian Palik, Terry Strong. 2014. Development of the selection system in northern hardwood forests of the Lake States: an 80-year silviculture research legacy
  • Matthew Powers, Randall Kolka, Brian Palik, Rachel McDonald, Martin Jurgensen. 2011. Long-term management impacts on carbon storage in Lake States forests
  • Coeli M. Hoover. 2011. Management impacts on forest floor and soil organic carbon in northern temperate forests of the US
  • Anthony W. D'Amato, John B. Bradford, Shawn Fraver, Brian J. Palik. 2011. Forest management for mitigation and adaptation: insights from long-term silvicultural experiments
  • Rachel A. Tarpey, Martin F. Jurgensen, Brian J. Palik, Randy K. Kolka. 2008. The long-term effects of silvicultural thinning and partial cutting on soil compaction in red pine ( Pinus resinosa Ait.) and northern hardwood stands in the northern Great Lakes Region of the United States
  • Christel C. Kern, Brian J. Palik, Terry F. Strong. 2006. Ground-layer plant community responses to even-age and uneven-age silvicultural treatments in Wisconsin northern hardwood forests
  • Richard M. Godman, Joseph J. Mendel. 1978. Economic values for growth and grade changes of sugar maple in the Lake States.
  • Gayne G. Erdmann, Robert R. Oberg. 1973. Fifteen-year results from six cutting methods in second-growth northern hardwoods.
  • Open access
  • Published: 25 August 2024

Comparison of the SBAR method and modified handover model on handover quality and nurse perception in the emergency department: a quasi-experimental study

  • Atefeh Alizadeh-risani 1 ,
  • Fatemeh Mohammadkhah 2 ,
  • Ali Pourhabib 2 ,
  • Zahra Fotokian 2 , 4 &
  • Marziyeh Khatooni 3  

BMC Nursing volume 23, Article number: 585 (2024)

Effective information transfer during nursing shift handover is a crucial component of safe care in the emergency department (ED). Examining nursing handover models shows that they are frequently associated with errors. Disadvantages of the SBAR handover model include uncertainty of nursing staff regarding transfer of responsibility and non-confidentiality of patient information. To increase reliability of handover, written forms and templates can be used in addition to oral handover by the bedside.

The purpose of this study is to compare the Situation, Background, Assessment, Recommendation (SBAR) method and the modified handover model on handover quality and nurse perception of shift handover in the ED.

This research was designed as a semi-experimental study, with a census method used for sampling. To collect data, the Nurse Handover Perception Questionnaire (NHPQ) and the Handover Quality Rating Tool (HQRT) were used after being translated and having their validity and reliability confirmed. A total of 31 nurses working in the ED received training on the modified shift handover model in a one-hour theory session and three hands-on bedside training sessions. This model was implemented by the nurses for one month. Data were analyzed with SPSS (version 26) using paired t-tests and analysis of covariance.

Results indicated a significant difference between the modified handover model and SBAR in the components of information transfer (P < 0.001), shared understanding (P < 0.001), working atmosphere (P = 0.004), handover quality (P < 0.001), and nurse perception of handover (P < 0.001). The univariate covariance test did not show demographic variables to be significantly correlated with handover perception or handover quality in the SBAR or modified methods (P > 0.05).

Conclusions

The results of this study can be presented to nursing managers as a guide in improving the quality of nursing care via implementing and applying the modified handover model in the nursing handover. The resistance of nurses against executing a new handover method was one of the limitations of the research, which was resolved by explanation of the plan and goals, as well as the cooperation of the hospital matron, and the ward supervisor. It is suggested to carry out a similar investigation in other hospital departments and contrast the outcomes with those obtained in the current study.

Introduction

One of the professional responsibilities of nurses in delivering high-quality and safe nursing care is the handover process [ 1 ]. This concept refers to the process of transferring the responsibility of care and patient information from one caregiver to another, in order to continue the care of the patient [ 2 ]. Effective information transfer during nursing shift handover is considered a vital component of safe care in the Emergency Department (ED). Some challenges in providing accurate information during handover include providing excessive or insufficient information, lack of a checklist, and delays in handover [ 3 ]. Incomplete transmission of information increases the occurrence of errors, leads to inappropriate treatment, delays diagnosis and treatment, and increases physician and nursing errors and treatment costs [ 4 ]. A study by Spooner showed that 80% of serious medical care errors are related to nursing handovers, and one fifth of patients suffer from complications due to handover errors [ 5 ]. A review of 3000 sentinel events demonstrated that a communication breakdown occurred 65–70% of the time. It has been demonstrated that poor communication handovers result in adverse events, delays in treatment, redundancies that impact efficiencies and effectiveness, low patient and healthcare provider satisfaction, and more admissions [ 3 ].

There are various nursing handover methods, including oral handover and the use of special forms [ 6 ]. The oral handover method at the bedside can lead to better communication, improved patient care, and increased patient satisfaction [ 7 ]. So far, several shift handover tools have been developed in hospital departments, including ISOBAR [ 8 ], ISBAR [ 9 ], SBAR [ 3 ], REED [ 10 ], ICCCO [ 11 ], VITAL and PVITAL [ 12 ], and the modified nursing handover model [ 13 ]. Examining nursing handover models shows that they are frequently associated with errors [ 14 ]. While the format to use for a handover was the topic of several nursing studies [ 15 , 16 , 17 , 18 ], accuracy of content and outcomes were not included. Barriers and facilitators to nursing handovers were identified, but evidence for best practice was not evident. Various strategies have been developed to enhance the effectiveness and efficiency of nursing handover, including standardized approaches, bedside handover, and technology. The majority of these models have been evaluated in inpatient settings; few have been studied in the ED. Among these shift handover models, the PVITAL model was specifically designed for the ED and includes components of Present patient, Intake and output, Treatment and diagnosis, Admission and discharge, and Legal and documentation. Despite its positive aspects, this model has inconsistencies that question its effectiveness in nursing shift handovers [ 13 ]. Also, one of the most widely used shift handover models is the SBAR model [ 19 ]. The SBAR model includes Situation, Background, Assessment, and Recommendation components. SBAR is an information tool that transmits standardized information, makes reports concise, targeted, and relevant, facilitates information exchange, and can be improved by involving the patient in delivery and transformation [ 20 ]. The SBAR handover model was proposed by the Joint Commission with the aim of reducing errors and increasing the quality of care. This model was initially designed by Leonard and Graham for use in health care systems [ 3 ]. In 2013, adoption of this model for nursing handovers was made mandatory by the Deputy Minister of Nursing of the Iran Ministry of Health [ 21 ]. Currently, this model is only implemented orally at the patient bedside [ 22 ]. Disadvantages of this model include uncertainty of nursing staff regarding transfer of responsibility and non-confidentiality of patient information. To increase the reliability of handover, written forms and templates can be used in addition to oral, face-to-face handover at the bedside [ 23 ]. In this regard, the modified nursing handover model was first designed by Klim et al. (2013) for shift handover in the ED. This method has a written form and template and includes components of identification and alert, assessment and progress, nursing care need, plan, and alerting the nurse in charge/medical officer based on vital sign parameters or clinical deterioration [ 24 ]. Findings of a study by Kerr (2016) showed that implementation of this model improves transmission of important information to nurses in subsequent shifts, leading to an increase in participation of patients and their companions in the handover process [ 13 ].

The use of a simple, structured, and standard model with a written template in nursing handovers is one of the elements influencing provision of appropriate services. According to research, implementation of the modified handover model in Iran has not been investigated to date. Despite the widespread use of SBAR, there is limited comparative research on its effectiveness relative to modified handover models in emergency settings. We hypothesize that the modified model will result in fewer handover errors compared to the SBAR method. This study aims to compare the effectiveness of the SBAR method and modified handover model on handover quality and nurse perception in the ED.

Materials and methods

This research was designed as a pre-post intervention, semi-experimental study, with a census method used for sampling.

Participants

The study location was the ED of Zakaria Razi Social Security Hospital in Qazvin, Iran. The sample was selected through a census of nurses working in the ED of Zakaria Razi Hospital in Qazvin. There were 45 nurses working in the emergency department, including 38 staff nurses, one head nurse, one assistant head nurse (staff), three triage nurses, and two outpatient operating room nurses. Six nurses had less than six months of work experience in the ED and were not included in the study according to the inclusion criteria. Assuming a Cohen’s effect size of 0.52 (based on a pilot sample of the dependent variable, quality of shift handover), a Type I error rate of 5%, and a statistical power of 80%, the required sample size was estimated at 32 individuals using G*Power software. A total of 32 nurses were included in the study, but one nurse withdrew from participation, resulting in a final sample size of 31 nurses. The inclusion criteria comprised willingness to participate in the study and at least 6 months of working experience in the ED. Unwillingness to continue cooperation was set as the exclusion criterion.
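
For readers who want to check the reported calculation, the sketch below reproduces a paired t-test sample-size estimate with the stated inputs (effect size 0.52, alpha 0.05, power 0.80); the study itself used G*Power, so this statsmodels call is only an approximate stand-in.

```python
# Minimal sketch of the sample-size calculation described in the text:
# paired t-test, Cohen's d = 0.52, two-sided alpha = 0.05, power = 0.80.
from math import ceil
from statsmodels.stats.power import TTestPower

n = TTestPower().solve_power(effect_size=0.52, alpha=0.05, power=0.80,
                             alternative="two-sided")
print(ceil(n))  # about 32 participants, consistent with the reported estimate
```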

Data collection (procedures)

Initially, the researcher made a list of the nurses employed in the ED. The nurses were then introduced to the study and its objectives, and participants were selected based on the inclusion criteria after providing informed consent. The SBAR model was routinely implemented orally in the ED. At the beginning of the research, the Nurse Handover Perception Questionnaire (NHPQ) and the Handover Quality Rating Tool (HQRT) were completed by all participants. Owing to lack of familiarity with the modified handover model, nurses were educated via a one-hour theory session in the hospital conference hall, where the items of the modified nursing handover checklist and how to complete it were taught using PowerPoint and a whiteboard. Three hands-on training sessions were held individually for all nurses, explaining the handover model, how to fill out the checklist, and how to use the checklist during shift handover at the patient’s bedside. To resolve ambiguities and questions, we communicated with the participants through online messaging. Brainstorming, clear explanations, effective communication, and receiving feedback were used to make the training sessions more productive. Moreover, the modified handover checklist was designed by the researcher and provided to the nurses for better understanding of the contents. Subsequently, the modified handover model was implemented by the participants for one month [ 13 ]. During this month, about 350 shift handovers were made with the modified handover method. To ensure proper implementation, the researcher attended and directly supervised all handover situations involving the target group. After implementation of the modified handover model, the NHPQ and HQRT were completed once more by the participants (Fig. 1).

Figure 1. The process of implementing the modified nursing handover model.

Instruments

Demographic information : included the variables of age, gender, marital status, level of education, employment type, years of work experience, years of work experience in the ED, and working conditions in terms of shifts.

Nurse Handover Perception Questionnaire (NHPQ) : This 22-item questionnaire captures nurses’ perception and performance regarding shift handover. The first half of the NHPQ examines perceptions regarding current practices and essential components of handover [ 15 ]. The second half reviews nurses’ views regarding bedside handover [ 23 ]. The items include a series of statements about nurses’ general understanding of shift handover and their experiences of clinical shift handover at the bedside. The tool is scored on a 4-point Likert scale, with total scores ranging from 22 to 88; a higher score indicates a higher perception of handover. Items 3, 4, 8, 10, 17, 20, and 21 are scored negatively (reverse-scored). Content validity was reported using a content validity index (CVI) of 0.92, which indicated satisfactory content validity. The internal reliability of the questionnaire items was determined using a Cronbach’s alpha of 0.99, and the one-dimensional Intraclass Correlation Coefficient (ICC) for the internal homogeneity test of the items was 0.92 [ 23 ].
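
A minimal sketch of the NHPQ scoring rule as described above: 22 items on a 4-point scale, with the listed items reverse-scored so that totals run from 22 to 88. The response data are hypothetical.

```python
# Hedged sketch of NHPQ scoring: 22 Likert items (1-4), items 3, 4, 8, 10,
# 17, 20, and 21 reverse-scored; higher totals mean better handover perception.
REVERSE_ITEMS = {3, 4, 8, 10, 17, 20, 21}

def nhpq_total(responses):
    """responses: dict mapping item number (1-22) to a rating in 1..4."""
    assert set(responses) == set(range(1, 23)), "expect all 22 items"
    return sum((5 - r) if item in REVERSE_ITEMS else r
               for item, r in responses.items())

example = {i: 3 for i in range(1, 23)}   # a uniform, made-up response set
print(nhpq_total(example))               # total lies between 22 and 88
```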

Handover Quality Rating Tool (HQRT) : The HQRT was developed to evaluate shift handover quality. This 16-item questionnaire includes five components: information transfer (items 1 to 7), shared understanding (items 8 to 10), working atmosphere (items 11 to 13), handover quality (item 14), and circumstances of the handover (items 15 and 16). It is scored on a 4-point Likert scale, with total scores ranging from 16 to 64; a higher score indicates better handover quality [ 24 ]. A previous study reported a reliability coefficient of 0.67 for this tool [ 25 ].

The above questionnaires have not been used in Iran to date. Therefore, they were translated and validated in the present study, as part of a master’s thesis in internal-surgical nursing [ 26 ]. The results related to the process of translating the questionnaires are summarized as follows:

Getting permission from the tool designer;

Translation from the reference language (English) to the target language (Persian): In this study, two translators familiar with English performed the translation from the original language to Persian. The translation process was carried out independently by the two translators.

Consolidation and comparison of translations: At this stage, the researchers held a meeting to review the translated questionnaires in order to identify and eliminate inappropriate phrases or concepts in the translation. The original version and the translated versions were checked for any discrepancies. The translated versions were combined and a single version was developed.

Translation of the final translated version from the target language (Persian) to the original language (English): This translation was performed by two experts fluent in English. The translated versions were reviewed by the research team and discussed until a consensus was reached. Subsequently, the Persian questionnaires were distributed to ten faculty members to assess content validity, and to twenty nurses working in the ED to evaluate reliability. This process was conducted twice, with a gap of 10 days between administrations. After making the necessary corrections, the final version of the questionnaire was prepared. In the present study, all items of the NHPQ and HQRT had a CVI above 0.88, which is acceptable. The S-CVI/UA was 0.86 and 0.87 for the NHPQ and HQRT, respectively. The S-CVI/Ave of both questionnaires was 0.98, which is in the acceptable range. The CVR of all items of both questionnaires was above 0.62. Cronbach’s alpha coefficient was 0.93 for the NHPQ and 0.96 for the HQRT. Hence, the reliability of the tools was confirmed [ 26 ].
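
The content-validity indices reported here follow standard definitions, sketched below: I-CVI is the share of experts rating an item relevant, S-CVI/UA is the share of items endorsed by every expert, and S-CVI/Ave is the mean I-CVI. The ratings matrix is simulated for illustration.

```python
# Hedged sketch of I-CVI, S-CVI/UA, and S-CVI/Ave for a 10-expert panel
# rating 22 items on a relevance scale. Ratings are simulated.
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.integers(2, 5, size=(10, 22))   # experts x items, values 2-4

relevant = ratings >= 3                        # 3 or 4 counts as "relevant"
i_cvi = relevant.mean(axis=0)                  # per-item proportion of experts
s_cvi_ua = float((i_cvi == 1.0).mean())        # items with universal agreement
s_cvi_ave = float(i_cvi.mean())                # average item-level CVI
print(f"S-CVI/UA = {s_cvi_ua:.2f}, S-CVI/Ave = {s_cvi_ave:.2f}")
```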

Data analysis

Descriptive and inferential statistics were used for data analysis in SPSS software (version 24). Paired t-tests, chi-square tests, and analysis of variance were used to compare the effects of the SBAR and modified handover models. A P value < 0.05 was considered significant.
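
The core before/after comparison can be sketched as a paired t-test with a paired-samples Cohen's d, as below. The five score pairs are invented solely to make the example runnable; the study analyzed 31 nurses.

```python
# Minimal sketch of the study's main test: paired t-test on per-nurse scores
# under SBAR vs. the modified model, plus Cohen's d for paired data.
import numpy as np
from scipy import stats

sbar = np.array([48, 50, 52, 47, 51], dtype=float)      # illustrative totals
modified = np.array([56, 58, 60, 55, 59], dtype=float)

t_stat, p_value = stats.ttest_rel(modified, sbar)
diff = modified - sbar
cohens_d = diff.mean() / diff.std(ddof=1)               # paired-samples d
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {cohens_d:.2f}")
```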

Results

Nurse characteristics

The average age of the participants was 33 ± 4 years. Seventeen (54.8%) were women, and 22 (71%) were married. Thirty (96.8%) had a bachelor’s degree, and 23 (74.2%) were officially employed. Fourteen (45.2%) had a work experience of 6–10 years, while 16 (51.6%) had less than 5 years of work experience (Table  1 ).

According to paired t-test results, a significant difference existed between the average handover quality of the SBAR model and the modified handover model (P < 0.001). Accordingly, the average quality of handover in the modified handover model (57.64) was 8.09 units higher than in the SBAR model (49.54). Also, based on paired t-test results, there was a significant difference between the two models in the components of information transfer (P < 0.001), shared understanding (P < 0.001), working atmosphere (P = 0.004), and handover quality (P < 0.001). Meanwhile, the component of circumstances of the handover was not significantly different between the two models (P = 0.227). Therefore, our findings indicated that handover quality and its components (except circumstances of the handover) were higher in the modified handover model than in the SBAR model. Analysis of Cohen’s d indicated that the modified handover model had a substantially greater influence on handover quality than the SBAR model (d = 1.29). According to the results, the modified handover model had the largest effect on the information transfer component (effect size 1.56) and the smallest effect on the circumstances of the handover (effect size 0.23) (Table 2).

Results of the paired t-test revealed a significant difference between the average nurse perception of handover under the SBAR and modified handover models (P < 0.001). The average nurse perception of handover was 9.64 units higher in the modified handover model (80.45) than in the SBAR model (70.80). Cohen’s d indicated a large advantage of the modified handover model over the SBAR model for nurses’ perception of handover (d = 1.51) (Table 2).

The results of the paired t-test demonstrated that all items except “not enough time allowed”, “there was a tension between the team”, “the person handing over under pressure”, and “the person receiving under pressure” were significantly different between the two models (P < 0.05). Comparing the two models according to Cohen’s effect size, the largest and smallest effect sizes belonged to the items “use of available documentation (charts, etc.)” (1.39) and “the person receiving under pressure” (0.16), respectively (Table 3).

Representative NHPQ items evaluated included:

  • Most of the information I receive during shift handover is not related to the patient under my care.
  • Noise interferes with my ability to concentrate during shift handover.
  • I believe effective communication skills (such as clear and calm speech) should be used in handover.
  • In my experience, shift handover is often disrupted by patients, companions or other staff.
  • After handover, I seek additional information about patients from another nurse or the nurse in charge.
  • I believe this shift handover model is time consuming.

According to calculated Cohen’s effect sizes, the largest and smallest effect sizes of the modified handover model in comparison with the SBAR method belonged to “I receive sufficient information on nursing care (activity, nutrition, hydration, and pain) during the shift handover” (1.54) and “I believe this shift handover model is time consuming” (0.024), respectively (Table  4 ).

Univariate covariance analysis was used to determine the relationship of demographic variables with nurse perception of handover and handover quality. Because it is a quantitative variable, age was entered as a covariate and the other variables as factors. The results revealed that demographic variables did not have a significant effect on nurses’ perception of handover or on handover quality in either of the two models (P > 0.05).
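
A hedged sketch of that covariance analysis: handover quality regressed on age (covariate) and a categorical demographic factor, with an ANOVA table computed on the fitted model. Variable names and values are placeholders for the study's actual data.

```python
# Hedged sketch of a univariate ANCOVA: quality ~ age (covariate) + gender
# (factor). The data frame below is an invented placeholder.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "quality": [57, 60, 55, 58, 61, 54, 59, 56],
    "age":     [28, 35, 31, 40, 33, 29, 37, 30],
    "gender":  ["f", "m", "f", "f", "m", "f", "m", "f"],
})

model = smf.ols("quality ~ age + C(gender)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # Type II sums of squares
```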

Discussion

The present study was conducted with the aim of comparing the effect of implementing the SBAR and modified handover models on handover quality and nurse perception of handover in the ED. Based on our findings, implementation of the modified handover model has a more favorable effect on the average handover quality and nurse perception scores than the SBAR method. The modified handover model was first designed by Klim et al. (2013), by modifying the components of the SBAR model via group interviews in the ED [ 17 ]. The modified handover model focused on a standardized approach, including checklists, with emphasis on nursing care and patient involvement. In the ED, this handover model enhanced continuity of nursing care and aspects of the way in which care was implemented and documented, which might translate to a reduced incidence of adverse events in this setting. Improvements observed in the current study, such as the use of charts for medication, vital signs, allergies, and fluid balance to review patient nursing care, and receiving sufficient information on nursing care (activity, nutrition, hydration, and pain) during the shift handover, might help prevent adverse events, including medication errors, and promote handover quality.

Another component of the new handover model was that handover should be conducted in the cubicle at the bedside and involve the patient and/or their companion. More recently, it has been shown that family members also value the opportunity to participate in handover, which promotes family-centered care. Nevertheless, there are disparate opinions among nurses, patients, and their families about whether patients should participate in handover. Florin et al. suggest that nurses should establish patient preferences for the degree of their participation in care [ 27 ]. In a phenomenographic study, Frank et al. found that ED patients want to be acknowledged but struggle to become involved in their care [ 28 ]. In the current study, handover was more likely to be conducted in front of the patient, and more patients had the opportunity to contribute to and/or listen to the handover discussion after the introduction of the ED structured nursing handover framework.

Preliminary data showed that there was mixed opinion regarding the appropriate environment for inter-shift handover in the ED. The framework was specifically modified to address deficits in nursing care practice, effect on handover quality and nurse perception of handover. For example, emphasis was placed on viewing the patient’s charts for medication, vital signs and fluid balance. This provides an opportunity for omissions of information, documentation, or care to be identified and addressed at the commencement of a shift. The results of a study by Kerr (2016) demonstrated that implementation of this model improves the transfer of important information to nurses of subsequent shifts and does not possess the shortcomings of the SBAR model [ 13 ].

In that study, implementing the modified handover model improved bedside handover quality from 62.5% to 93%, patient participation in the handover process from 42.1% to 80%, information transfer from 26.9% to 67.8%, identification of patients with allergies from 51.2% to 82%, the amount of documentation from 82.6% to 94.1%, and the use of charts and documentation during handover from 38.7% to 60.8%, while decreasing omission of essential information such as vital signs from 50% to 32.2%. The authors concluded that implementation of the modified handover model increases documentation, improves nursing care, improves the receipt of information, enhances patient participation during handover, reduces errors in care and documentation, and promotes bedside handover. A good quality handover facilitates the transfer of information, mutual understanding, and a good working environment [ 13 ]. These findings are consistent with the results of the current study.

Moreover, Beigmoradi (2019) showed that in the SBAR model, less attention is paid to clinical records and evaluation of patient body systems during the handover [ 29 ].

Patients in the ED are treated urgently, so a comprehensive handover is needed immediately; a non-comprehensive handover model interrupts the flow of information, which reduces handover efficiency. In contrast, the results of a study by Li et al. (2022) demonstrated that implementing a combined model of SBAR and mind mapping leads to a significant improvement in the quality of handover and nurse perception of the patient, while reducing defects in shift handover [ 30 ]. Kazemi et al. (2016) showed that patient participation in the handover process increases patient and nurse satisfaction and helps inform patients of their care plan [ 22 ].

According to our findings, demographic variables did not have a significant effect on nurses’ perception of handover or on handover quality in the SBAR or modified handover models. These results can be compared with those of other studies in some respects. Mamalelala et al. showed that experience is significantly associated with the transfer of information during shift handover: nurses with more than 10 years of experience showed higher levels of shared communication and information transfer during shift handover [ 31 ]. The findings of Zakrison et al. (2016) also demonstrated that more experienced nurses are more concerned about transferring information than less experienced ones [ 32 ], which is not consistent with the results of the present study. The reason for this discrepancy may be the different characteristics of the study samples.

The findings of the present study demonstrated that the modified handover model clearly improves shift handover quality, information transfer, shared understanding, and perception of handover in the ED. Hence, the results of this study can be presented to nursing managers and hospital quality-improvement managers as a guide for improving the quality of nursing care by implementing and applying this strategy in the nursing handover. The ED structured nursing modified handover framework focused on a standardized approach, including checklists, with emphasis on nursing care and patient involvement. This straightforward and easy-to-implement strategy has the potential to enhance continuity of care and the completion of nursing care tasks and documentation in the ED.

Strengths and limitations

The present research is the first study to investigate the effect of the modified handover model on handover quality and nurses’ perception of handover in Iran.

The modified handover model tool is a reliable and validated tool that can be easily implemented in ED practice for sharing information among health care providers; however, there are limitations of use in patients with complex medical histories and care plans, especially in the critical care setting. In addition, the modified handover model tool requires training all clinical staff so that they can understand communication well. Future research might test whether introduction of this handover model in the ED setting results in actual enhanced patient safety, including reduction in medication errors.

The resistance of nurses against executing a new handover method was one of the limitations of the research, which was resolved by explanation of the plan and goals, as well as the cooperation of the hospital matron, and the ward supervisor.

Key points for policy, practice and/or research

The results of this study can provide nursing managers with a model of nursing shift handover that promotes the quality of nursing care and patient-related concepts. Interventions could target a combination of the content, communication method, and location aspects of the modified handover model.

Implementing a standardized handover framework such as the modified handover model method allows for concise and comprehensive information handoffs.

The modified handover model tool might be an adaptive tool that is suitable for many healthcare settings, in particular when clear and effective interpersonal communication is required.

The modified handover model provides an opportunity for omissions of information, documentation, or care to be identified and addressed at the commencement of a shift.

Future research

Future studies on the validation of the modified handover model tool in various medical fields, strategies to reinforce the use of the tool during all patient-related communication among health care providers, and comparison studies of the modified handover model communication tool would be beneficial.

Translation of these findings for enhanced patient safety should be measured in the future, along with sustainability of the new nursing process and external validation of the findings in other settings.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author upon reasonable request.

Vaismoradi M, Tella S, Logan A, Khakurel P, J. and, Vizcaya-Moreno F. Nurses’ adherence to patient safety principles: a systematic review. Int J Environ Res Public Health. 2020;17(6):2028–43.

Kim EJ, Seomun G. Handover in nursing: a concept analysis. Res Theory Nurs Pract. 2020;34(4):297–320.

Kerr D, Lu S, Mckinlay L. Bedside handover enhances completion of nursing care and documentation. J Nurs Care Qual. 2013;28:217–25.

Smeulers M, Lucas C, Vermeulen H. Effectiveness of different nursing handover styles for ensuring continuity of information in hospitalized patients. Cochrane Database Syst Reviews. 2014;6:CD009979.

Spooner AJ, Aitken LM, Corley A, Fraser JF, Chaboyer W. Nursing team leader handover in the intensive care unit contains diverse and inconsistent content: an observational study. Int J Nurs Stud. 2016;61:165–72.

Bressan V, Cadorin L, Pellegrinet D, Bulfone G, Stevanin S, Palese A. Bedside shift handover implementation quantitative evidence: findings from a scoping review. J Nurs Adm Manag. 2019;27(4):815–32.

Bradley S, Mott S. Adopting a patient-centered approach: an investigation into the introduction of bedside handover to three rural hospitals. J Clin Nurs. 2014;23(13–14):1927–36.

Yee KC, Wong MC, Turner P. HAND ME AN ISOBAR: a pilot study of an evidence-based approach to improving shift‐to‐shift clinical handover. Med J Aust. 2009;190(S11):S121–4.

Thompson JE, Collett LW, Langbart MJ, Purcell NJ, Boyd SM, Yuminaga Y, et al. Using the ISBAR handover tool in junior medical officer handover: a study in an Australian tertiary hospital. Postgrad Med J. 2011;87(1027):340–4.

Tucker A, Fox P. Evaluating nursing handover: the REED model. Nurs Standard. 2014;28(20):44–8.

Bakon S, Wirihana L, Christensen M, Craft J. Nursing handovers: an integrative review of the different models and processes available. Int J Nurs Pract. 2017;23(2):e12520.

Cross R, Considine J, Currey J. Nursing handover of vital signs at the transition of care from the emergency department to the inpatient ward: an integrative review. J Clin Nurs. 2019;28(5–6):1010–21.

Kerr D, Klim S, Kelly AM, McCann T. Impact of a modified nursing handover model for improving nursing care and documentation in the emergency department: a pre-and post‐implementation study. Int J Nurs Pract. 2016;22(1):89–97.

Burgess A, van Diggele C, Roberts C, Mellis C. Teaching clinical handover with ISBAR. BMC Med Educ. 2020;20(2):1–8.

Riesenberg LA, Leitzsch J, Cunningham JM. Nursing handoffs: a systematic review of the literature: surprisingly little is known about what constitutes best practice. Am J Nurs. 2010;110(4):24–36.

Staggers N, Clark L, Blaz JW, Kapsandoy S. Nurses’ information management and use of electronic tools during acute care handoffs. West J Nurs Res. 2012;34(2):153–73.

Staggers N, Clark L, Blaz JW, Kapsandoy S. Why patient summaries in electronic health records do not provide the cognitive support necessary for nurses’ handoffs on medical and surgical units: insights from interviews and observations. Health Inf J. 2011;17(3):209–23.

Porteous JM, Stewart-Wynne EG, Connolly M, Crommelin PF. ISoBAR—a concept and handover checklist: the National Clinical Handover Initiative. Med J Aust. 2009;190(11):S152–6.

Moi EB, Söderhamn U, Marthinsen GN, Flateland S. The ISBAR tool leads to conscious, structured communication by healthcare personnel. Sykepleien Forskning. 2019;14(74699):e–74699.

Iran Ministry of Health and Medical Education. Instruction of nursing shift handover. Iran Ministry of Health and Medical Education (MOHME); 2017.

Klim S, Kelly AM, Kerr D, Wood S, McCann T. Developing a framework for nursing handover in the emergency department: an individualized and systematic approach. J Clin Nurs. 2013;22(15–16):2233–43.

Clari M, Conti A, Chiarini D, Martin B, Dimonte V, Campagna S. Barriers to and facilitators of Bedside nursing handover: a systematic review and meta-synthesis. J Nurs Care Qual. 2021;36(4):E51–8.

Cho S, Lee JL, Kim KS, Kim EM. Systematic review of quality improvement projects related to intershift nursing handover. J Nurs Care Qual. 2022;37(1):E8–14.

Tortosa-Alted R, Martínez-Segura E, Berenguer-Poblet M, Reverté-Villarroya S. Handover of critical patients in urgent care and emergency settings: a systematic review of validated assessment tools. J Clin Med. 2021;10(24):5736.

Halm MA. Nursing handoffs: ensuring safe passage for patients. Am J Crit Care. 2013;22(2):158–62.

Kazemi M, Sanagoo A, Joubari L, Vakili M. THE effect of delivery nursing shift at bedside with patient’s partnership on patients’ satisfaction and nurses’ satisfaction, clinical trial, quasi-experimental study. Nurs Midwifery J. 2016;14(5):426–36.

Florin J, Ehrenberg A, Ehnfors M. Patient participation in clinical decision-making in nursing: a comparative study of nurses’ and patients’ perceptions. J Clin Nurs. 2006;15:1498–508.

Frank C, As M, Dahlberg K. Patient participation in emergency care–a phenomenographic study based on patients’ lived experience. Int Emerg Nurs. 2009;17(1):15–22.

Beigmoradi S, Pourshirvani A, Pazokian M, Nasiri M. Evaluation of nursing handoff skill among nurses using Situation-background-assessment-recommendation Checklist in General wards. Evid Based Care. 2019;9(3):63–8.

Li X, Zhao J, Fu S. SBAR standard and mind map combined communication mode used in emergency department to reduce the value of handover defects and adverse events. J Healthc Eng. 2022;8475322:1–6.

Mamalelala TT, Schmollgruber S, Botes M, Holzemer W. Effectiveness of handover practices between emergency department and intensive care unit nurses. Afr J Emerg Med. 2023;13(2):72–77.

Zakrison TL, Rosenbloom B, McFarlan A, Jovicic A, Soklaridis S, Allen C, et al. Lost information during the handover of critically injured trauma patients: a mixed-methods study. BMJ Qual Saf. 2016;25(12):929–36.

Acknowledgements

This article was derived from a master’s thesis in aging nursing. The authors would like to acknowledge the research deputy at Babol University of Medical Sciences for their support.

Funding

This study was supported by the research deputy at Babol University of Medical Sciences.

Author information

Authors and Affiliations

Student Research Committee, Nursing Care Research Center, Health Research Institute, Babol University of Medical Sciences, Babol, Iran

Atefeh Alizadeh-risani

Nursing Care Research Center, Health Research Institute, Babol University of Medical Sciences, Babol, Iran

Fatemeh Mohammadkhah, Ali Pourhabib & Zahra Fotokian

Department of Critical Care Nursing, School of Nursing and Midwifery, Qazvin University of Medical Sciences, Qazvin, Iran

Marziyeh Khatooni

Contributions

All authors contributed to the study conception and design, and all authors read and approved the final manuscript. Atefeh Alizadeh-risani, Zahra Fotokian: study concept and design, acquisition of subjects and/or data, analysis and interpretation of data. Fatemeh Mohammadkhah, Ali Pourhabib: study design, analysis and interpretation of data, preparation of manuscript. Marziyeh Khatooni: analysis and interpretation of data.

Corresponding author

Correspondence to Zahra Fotokian.

Ethics declarations

Ethics approval and consent to participate

The Ethics Committee of Babol University of Medical Sciences approved this research proposal (coded under IR.MUBABOL.REC.1401.162). This research was conducted in accordance with the Declaration of Helsinki and all study participants provided written informed consent. The participant rights were preserved (all data were kept anonymous and confidential).

Consent for publication

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article

Alizadeh-risani, A., Mohammadkhah, F., Pourhabib, A. et al. Comparison of the SBAR method and modified handover model on handover quality and nurse perception in the emergency department: a quasi-experimental study. BMC Nurs 23, 585 (2024). https://doi.org/10.1186/s12912-024-02266-4

Received : 10 June 2024

Accepted : 16 August 2024

Published : 25 August 2024

DOI : https://doi.org/10.1186/s12912-024-02266-4

Keywords

  • SBAR method
  • Modified handover model
  • Emergency department
  • Nursing perception
  • Patient safety

30 Years of Experimental Education Research in the Post-Soviet Space: A Meta-Analysis of Interventions

Description

This is supplementary material to the article “30 Years of Experimental Education Research in the Post-Soviet Space: A Meta-Analysis of Interventions”. The meta-analysis systematically evaluates the potential of available research in post-Soviet countries as a basis for an evidence-based approach to improving student achievement. The study was conducted on a selection of 41 publications describing educational interventions aimed at improving student achievement. The supplementary material consists of three files: Supplementary_file_1, an xlsx database with coded characteristics of all studies and the effect sizes included in the analysis; Supplementary_file_2, a short version of the database in xlsx format with the variables used in the multi-level analysis to calculate the pooled effect size and examine moderators; and Supplementary_file_3, a docx file with all R code used for the analysis in the article.
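
As a hedged, simplified stand-in for the multi-level model the authors fit in R, the sketch below pools illustrative effect sizes with a DerSimonian-Laird random-effects estimator; the numbers are invented, not taken from the 41 included studies.

```python
# DerSimonian-Laird random-effects pooling, a simplified sketch of the kind
# of synthesis described. Effect sizes and variances below are illustrative.
import numpy as np

yi = np.array([0.30, 0.55, 0.10, 0.42])     # study effect sizes (made up)
vi = np.array([0.02, 0.05, 0.03, 0.04])     # their sampling variances

w = 1 / vi
y_fe = (w * yi).sum() / w.sum()             # fixed-effect mean
q = (w * (yi - y_fe) ** 2).sum()            # Cochran's Q
c = w.sum() - (w ** 2).sum() / w.sum()
tau2 = max(0.0, (q - (len(yi) - 1)) / c)    # between-study variance

w_re = 1 / (vi + tau2)
pooled = (w_re * yi).sum() / w_re.sum()
se = np.sqrt(1 / w_re.sum())
print(f"pooled effect = {pooled:.3f}, 95% CI half-width = {1.96 * se:.3f}")
```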

Anomaly Detection for Charging Voltage Profiles in Battery Cells in an Energy Storage Station Based on Robust Principal Component Analysis

Article outline: 1. Introduction; 2. Source and Preprocessing of Data; 3. Anomaly Detection Process for Battery Cells (3.1. The Principle of RPCA; 3.2. Consistency Assessment for Battery Cells; 3.3. Screening and Identification); 4. Experimental Analysis and Verification (4.1. Experimental Analysis; 4.2. Comparison and Verification; 4.3. Anomaly Reasons and Analysis); 5. Conclusions
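
The principle behind RPCA (Section 3.1) is to split the measurement matrix M into a low-rank part L (shared, normal charging behavior) and a sparse part S (cell-specific anomalies) by minimizing ||L||_* + λ||S||_1 subject to L + S = M. The sketch below is a generic principal-component-pursuit solver using a basic augmented-Lagrangian loop, not the paper's exact implementation.

```python
# Hedged sketch of robust PCA by principal component pursuit: M = L + S,
# solved with a simple augmented-Lagrangian iteration. Not the paper's code.
import numpy as np

def soft(x, tau):
    """Entrywise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(x, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u @ np.diag(soft(s, tau)) @ vt

def rpca(m, max_iter=500, tol=1e-7):
    lam = 1.0 / np.sqrt(max(m.shape))        # standard PCP weight
    mu = m.size / (4.0 * np.abs(m).sum())    # common heuristic step size
    s = np.zeros_like(m)
    y = np.zeros_like(m)
    for _ in range(max_iter):
        l = svt(m - s + y / mu, 1.0 / mu)    # update low-rank part
        s = soft(m - l + y / mu, lam / mu)   # update sparse part
        y += mu * (m - l - s)                # dual ascent on the constraint
        if np.linalg.norm(m - l - s) <= tol * np.linalg.norm(m):
            break
    return l, s

# Columns could be per-cell charging voltage curves: large entries in S then
# point to cells whose profiles deviate from the pack-wide low-rank pattern.
```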

Battery cell specifications:

  • Battery type: LFP
  • Total voltage (V): 761.6
  • Battery charging termination voltage (V): 3.65
  • Battery discharge termination voltage (V): 2.7
  • Nominal voltage (V): 3.2
  • Nominal capacity (mAh): 3000
Cell numbers flagged as anomalous by each comparison method on the two test days:

  • Average Deviation-3σ: 30 June (3, 21, 33, 53, 58, 108); 1 July (3, 21, 33, 53, 58, 108)
  • Variance-3σ: 30 June (3, 21, 33, 53, 58, 108); 1 July (3, 33, 53, 58, 108)
  • Range-3σ: 30 June (3, 21, 33, 53, 58, 108); 1 July (3, 21, 33, 53, 58, 108)
  • Euclidean Distance-3σ: 30 June (3, 21, 33, 53, 58); 1 July (3, 21, 33, 53, 58, 108)
  • Signal Energy-3σ: 30 June (3, 21, 33, 53, 58); 1 July (3, 21, 33, 53, 58, 108)
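
The comparison methods in the list above each reduce a cell's charging-voltage curve to a scalar feature and flag cells lying more than three standard deviations from the pack mean. A hedged sketch of that 3σ screening, using the average-deviation feature as an example, follows; the data are random placeholders.

```python
# Hedged sketch of 3-sigma screening over per-cell features derived from
# charging-voltage curves. Data are simulated placeholders.
import numpy as np

rng = np.random.default_rng(7)
voltages = rng.normal(3.3, 0.01, size=(120, 500))  # cells x time samples
voltages[52] += 0.05                               # inject one anomalous cell

# Feature: mean absolute deviation of each cell from the pack-average curve.
feature = np.abs(voltages - voltages.mean(axis=0)).mean(axis=1)

mu, sigma = feature.mean(), feature.std()
flagged = np.where(np.abs(feature - mu) > 3 * sigma)[0] + 1  # 1-based IDs
print("anomalous cells:", flagged)                           # the injected cell
```
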
Cite this article

Yu, J.; Guo, Y.; Zhang, W. Anomaly Detection for Charging Voltage Profiles in Battery Cells in an Energy Storage Station Based on Robust Principal Component Analysis. Appl. Sci. 2024, 14, 7552. https://doi.org/10.3390/app14177552

Coupled Heat Transfer and Applications of Machine Learning

About this Research Topic

With the continued development of artificial intelligence technology, machine learning has become an important method of predictive analysis. In particular, coupled heat transfer analysis based on machine learning has significant advantages in terms of reducing experimental costs and modeling time. Through machine learning, researchers can rapidly characterize the coupled heat transfer processes between temperature fields, flow fields, and stress fields. Furthermore, machine learning can predict the results of experiments using models trained on existing experimental data, meaning researchers can predict heat transfer performance under different conditions without running physical experiments, reducing the cost and time required. In addition, machine-learning-based modeling can learn the relationships governing heat transfer processes directly from training data, avoiding complex mathematical derivations and calculations. This Research Topic aims to explore advances in coupled heat transfer analysis based on machine learning. The goal is to provide a platform for researchers to address challenges in the applications of machine learning in coupled heat transfer and explore innovative solutions. Research areas within the scope of this collection include, but are not limited to:

  • Intelligent algorithm optimization of coupled heat transfer
  • Simulation of heat transfer processes
  • Experimental and simulated verification
  • Multiphysics coupling analysis
  • Optimization of heat transfer in complex structures
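
As a hedged illustration of the surrogate-modeling workflow this description sketches, the example below fits a regression model to synthetic "experimental" heat transfer data and then predicts performance at an untested condition; the correlation used to generate the data is a toy stand-in, not a claim about any specific system.

```python
# Hedged sketch: train a surrogate on existing data, then predict heat
# transfer performance at new conditions without running an experiment.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
x = rng.uniform([1e3, 0.7], [1e5, 10.0], size=(300, 2))   # Re, Pr (synthetic)
y = 0.023 * x[:, 0] ** 0.8 * x[:, 1] ** 0.4               # Dittus-Boelter-like Nu
y += rng.normal(0.0, 0.05 * y.std(), size=y.shape)        # measurement noise

model = GradientBoostingRegressor().fit(x[:240], y[:240])
print("held-out R^2:", round(model.score(x[240:], y[240:]), 3))
print("Nu prediction at a new condition:", model.predict([[5e4, 3.0]]))
```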

Keywords: coupled heat transfer, heat transfer processes, machine learning, artificial intelligence, algorithm optimization, simulated verification, experimental verification, multiphysics coupling analysis, physics-informed neural networks, reacting flow

Important Note : All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

  • Open access
  • Published: 21 August 2024

Association between gynecologic cancer and Alzheimer’s disease: a bidirectional Mendelian randomization study

Di Cao, Shaobo Zhang, Yini Zhang, Ming Shao, Qiguang Yang & Ping Wang

BMC Cancer volume 24, Article number: 1032 (2024)


Abstract

Background

Alzheimer’s disease (AD) occurs at a higher rate in women. Previous epidemiological studies have suggested a potential association between AD and gynecological cancers, but the causal relationship between them remains unclear. This study aims to explore the causal link between 12 types of gynecological cancers and AD using a bidirectional Mendelian randomization (MR) approach.

Methods

We obtained genetic instruments for AD using data from the largest available genome-wide association study. Genetic association data for 12 types of gynecological cancers were sourced from the Finnish Biobank. These cancers include breast cancer (BC), cervical adenocarcinoma (CA), cervical squamous cell carcinoma (CSCC), cervical cancer (CC), endometrial cancer (EC), ovarian endometrioid carcinoma (OEC), ovarian cancer (OC), ovarian serous carcinoma (OSC), breast carcinoma in situ (BCIS), cervical carcinoma in situ (CCIS), endometrial carcinoma in situ (ECIS), and vulvar carcinoma in situ (VCIS). We used the inverse-variance weighted (IVW) model for the primary causal analysis and conducted horizontal pleiotropy tests, heterogeneity tests, MR-PRESSO tests, and leave-one-out analyses to ensure the robustness of our results. We also applied replication analysis and meta-analysis to further validate our findings.

Results

The study found that EC (P_IVW = 0.037, OR [95% CI] = 1.032 [1.002, 1.064]) and CCIS (P_IVW = 0.046, OR [95% CI] = 1.032 [1.011, 1.064]) increase the risk of AD, whereas OC was negatively correlated with AD (P_IVW = 0.016, OR [95% CI] = 0.974 [0.954, 0.995]). In reverse MR analysis, AD increased the risk of CC (P_IVW = 0.039, OR [95% CI] = 1.395 [1.017, 1.914]) and VCIS (P_IVW = 0.041, OR [95% CI] = 1.761 [1.027, 2.021]), but was negatively correlated with OEC (P_IVW = 0.034, OR [95% CI] = 0.634 [0.417, 0.966]). Sensitivity analyses demonstrated robustness, and these findings were further substantiated through replication and meta-analyses.

Conclusions

Our MR study supports a causal relationship between AD and gynecological cancers. This encourages further research into the incidence of gynecological cancers in female Alzheimer’s patients and the active prevention of AD.


Background

Gynecological cancer refers to cancers that originate in the female reproductive system [1]. Since the incidence of breast cancer in women is higher than in men, this article classifies breast cancer as a form of gynecological cancer, a perspective shared by some research studies [2, 3]. Breast cancer, cervical cancer, ovarian cancer, and endometrial cancer, as common malignancies among women, have seen a gradual increase in incidence rates in recent years, with the age of onset trending towards younger populations [4]. According to the latest global cancer statistics from 2020, the incidence rate of breast cancer in women has surpassed that of lung cancer, making it the most common cancer worldwide [5]. In 2020, there were 2.3 million new cases of breast cancer globally, accounting for 11.7% of all cancer cases, with breast cancer deaths comprising 6.9% of all cancer-related deaths. There were 0.60 million new cases of cervical cancer, representing 3.1% of all cancer cases; endometrial cancer had 0.42 million new cases, accounting for 2.2%; and ovarian cancer had nearly 0.31 million new cases, making up 1.6% [6]. These figures represent a substantial economic burden on society [7]. For gynecological cancers, particularly recurrent and advanced-stage disease, traditional standard treatments often leave much to be desired [8]. While developing new treatment strategies is crucial, preventing these diseases is equally important and is increasingly recognized as a priority by the public.

Alzheimer’s disease (AD) is a common neurodegenerative condition in the geriatric population [9], with studies finding a higher prevalence in women [10, 11, 12]. Epidemiological evidence supports an inverse relationship between the incidence of cancer and AD [13, 14]: a diagnosis of cancer reduces the risk of developing AD, and vice versa [15]. However, does this apply to all cancers? Research indicates that women with breast cancer have a significantly increased risk of early-onset Alzheimer’s and related dementias (ADRD) [16]. Additionally, older breast cancer survivors exhibiting age-related phenotypes and genotypes may face increased risks of cognitive decline [17]. Studies have also found that breast cancer patients carrying the APOE4 allele experience declines in memory, attention, and learning abilities for an extended period post-treatment [18]. In a study involving over six million women, 36,131 breast cancer patients and 3,019 cervical cancer patients were found to have early-onset ADRD [19]. Moreover, AD patients tend to be diagnosed with gynecological cancers at a later stage, when the disease is more severe, often missing the optimal treatment window [20]. Due to the limitations of clinical cancer research and the communication challenges posed by cognitive impairments in AD patients, establishing a causal relationship between the two conditions is difficult.

Mendelian randomization (MR) studies exploit the random allocation of genetic variants at conception, analogous to the random assignment of interventions at the start of a randomized controlled trial [21]. This design reduces the impact of confounding factors and overcomes common causality issues in observational epidemiological studies, while avoiding the high costs, ethical concerns, and feasibility and experimental-environment issues associated with randomized controlled trials [22, 23]. Hence, this study designed a bidirectional MR analysis using single nucleotide polymorphisms (SNPs) as instrumental variables (IVs) to explore the bidirectional causal effects between gynecological tumors and AD, offering new insights into preventing and treating AD and cancer.

Methods

Study design

To explore the causal relationship between gynecologic cancer and AD, the present study conducted a bidirectional MR analysis. Figure 1 illustrates our study’s methodology and process. The selected genetic IVs need to satisfy three assumptions of MR analysis [24]. First, the genetic variants are closely associated with the exposure. Second, the genetic variants are independent of confounders of the exposure-outcome relationship. Third, the genetic variants influence the outcome only through the exposure and not via any other pathway; these assumptions are formalized below. We derived our genetic instruments for exposure and outcome from publicly available genome-wide association study (GWAS) summary statistics. As all ethical aspects were addressed in the original research, our study requires no additional ethical approval. This study follows the STROBE-MR reporting guidelines.
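In standard instrumental-variable notation (a conventional textbook formalization, not taken verbatim from the paper), writing Z for a genetic variant, X for the exposure, Y for the outcome, and U for unmeasured confounders, the three assumptions read:

```latex
% Core instrumental-variable assumptions behind MR (standard formulation):
%   Z = genetic variant (instrument), X = exposure, Y = outcome, U = confounders.
\begin{align*}
\text{(i) Relevance:}\quad    & \operatorname{Cov}(Z, X) \neq 0 \\
\text{(ii) Independence:}\quad & Z \perp U \\
\text{(iii) Exclusion:}\quad   & Z \perp Y \mid X, U
\end{align*}
```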

Figure 1. Flowchart of the bidirectional Mendelian randomization study. Abbreviations: MR, Mendelian randomization; SNP, single nucleotide polymorphism; GWAS, genome-wide association study; LD, linkage disequilibrium

Data sources

The 12 common cancers in women selected for this study, namely breast cancer (BC), cervical adenocarcinoma (CA), cervical squamous cell carcinoma (CSCC), cervical cancer (CC), endometrial cancer (EC), ovarian endometrioid carcinoma (OEC), ovarian cancer (OC), ovarian serous carcinoma (OSC), breast carcinoma in situ (BCIS), cervical carcinoma in situ (CCIS), endometrial carcinoma in situ (ECIS), and vulvar carcinoma in situ (VCIS), were analyzed using IVs from the Finnish Cancer Registry (R9) in the Finnish Biobank. The Finnish Biobank contains data related to healthcare, genetics, familial inheritance, demographics, education, and employment, among other domains, offering high research value and practicality [25]. The AD genome-wide dataset is derived from the European Alzheimer’s Disease Biobank (EADB) consortium, comprising 39,106 cases and 46,828 controls within a total sample of 487,511 individuals and 20,921,626 SNPs [26]. All participants are of European ancestry. More details can be found in Table 1.

Selection and evaluation of IVs

In the forward MR analysis, gynecologic cancer is the exposure factor and AD is the outcome event. The IVs associated with gynecologic cancer were required to reach a significance of P < 5e-06. We initially selected genetic variants at the stricter genome-wide criterion of P < 5e-08; however, under these stringent conditions no SNPs were available for cancers such as CA and CSCC. In the reverse MR analysis, AD is treated as the exposure and gynecologic cancer as the outcome, with AD-related IVs meeting the P < 5e-08 threshold. In both directions, independent SNPs were obtained by linkage disequilibrium (LD) clumping with r² < 0.001 within a 10,000 kb window. The F-statistic reflects instrument strength in MR analysis, with a value greater than 10 indicating statistical robustness [27]; it is calculated as F = R²(N - 2)/(1 - R²), where R² is the variance in the exposure explained by the instrument and N is the sample size [28]. It is essential to exclude SNPs associated with confounding factors, so we used PhenoScanner (version 2, accessed on October 30, 2023) to eliminate SNPs linked to potential confounders. The filtered SNPs served as the IVs for our study (a minimal screening sketch follows).
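To illustrate this screening step, here is a minimal sketch in Python. The column names (snp, beta, eaf, pval) and the toy rows are assumptions for illustration only; the per-SNP variance-explained approximation R² = 2β²·EAF·(1 - EAF) is a common shortcut, and LD clumping (r² < 0.001, 10,000 kb) is omitted because it requires a reference panel and dedicated tooling (e.g., PLINK or TwoSampleMR’s clumping functions).

```python
# Minimal IV-screening sketch (illustrative): filter GWAS summary statistics
# by the significance threshold used above and drop weak instruments (F <= 10).
import pandas as pd

def screen_instruments(gwas: pd.DataFrame, n_samples: int,
                       p_threshold: float = 5e-6) -> pd.DataFrame:
    candidates = gwas[gwas["pval"] < p_threshold].copy()
    # Approximate per-SNP variance explained: R^2 = 2 * beta^2 * EAF * (1 - EAF).
    r2 = 2 * candidates["beta"] ** 2 * candidates["eaf"] * (1 - candidates["eaf"])
    # F-statistic from the text: F = R^2 (N - 2) / (1 - R^2); require F > 10.
    candidates["F"] = r2 * (n_samples - 2) / (1 - r2)
    return candidates[candidates["F"] > 10]

# Toy rows standing in for real summary statistics (hypothetical values).
toy = pd.DataFrame({
    "snp": ["rs0001", "rs0002", "rs0003"],
    "beta": [0.08, 0.02, 0.12],
    "eaf": [0.30, 0.45, 0.10],
    "pval": [1e-9, 2e-4, 4e-8],
})
print(screen_instruments(toy, n_samples=100_000))  # rs0002 fails the p threshold
```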

MR analysis

MR studies investigating the relationship between exposure and outcome primarily use the inverse-variance weighted (IVW) method, because it provides the most precise causal estimate when the instruments are free of pleiotropy [29]. MR-Egger, Simple Mode, weighted median (WM), and Weighted Mode methods are used as supplementary methods to assess the robustness of the primary analysis; for intuition, the IVW estimator itself is sketched below.
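The fixed-effect IVW estimator can be written directly from per-SNP summary statistics: each SNP contributes a Wald ratio β_Y/β_X, and the ratios are pooled with inverse-variance weights β_X²/se_Y². The sketch below uses made-up numbers and is not the authors’ TwoSampleMR pipeline; it also exponentiates the pooled log-odds estimate to report OR [95% CI] in the form used throughout the Results.

```python
# Fixed-effect IVW sketch (illustrative, not the authors' R pipeline).
# beta_exp / beta_out: per-SNP effects on exposure and outcome;
# se_out: standard errors of the outcome effects.
import numpy as np
from scipy import stats

def ivw(beta_exp, beta_out, se_out):
    beta_exp, beta_out, se_out = map(np.asarray, (beta_exp, beta_out, se_out))
    weights = beta_exp**2 / se_out**2      # inverse-variance weights
    ratios = beta_out / beta_exp           # per-SNP Wald ratios
    estimate = np.sum(weights * ratios) / np.sum(weights)
    se = 1.0 / np.sqrt(np.sum(weights))
    p = 2 * stats.norm.sf(abs(estimate / se))
    # Effects on a binary outcome are on the log-odds scale, so exponentiate
    # to report OR [95% CI] as in the Results section.
    lo, hi = estimate - 1.96 * se, estimate + 1.96 * se
    return np.exp(estimate), (np.exp(lo), np.exp(hi)), p

or_, ci, p = ivw(beta_exp=[0.08, 0.12, 0.05],
                 beta_out=[0.004, 0.005, 0.002],
                 se_out=[0.003, 0.004, 0.002])
print(f"OR = {or_:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), p = {p:.3f}")
```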

We conducted various sensitivity analyses to ensure the robustness of the bidirectional MR results. These included a horizontal pleiotropy test, a heterogeneity test, the MR-PRESSO test, and leave-one-out analysis. The horizontal pleiotropy test is performed by MR-Egger regression: a significant intercept term in the MR-Egger analysis indicates the presence of horizontal pleiotropy [30, 31]. Cochran’s Q test is used to assess the heterogeneity of SNPs; a statistically significant Q statistic (P ≤ 0.05) suggests considerable heterogeneity in the analysis results. The MR-PRESSO (MR pleiotropy residual sum and outlier) test is used to detect outliers [32]; if outliers are detected, they are removed and the remaining IVs reanalyzed. Leave-one-out analysis evaluates whether any single SNP drives the significant results [33]. The risk association between gynecologic cancer and AD is expressed as an odds ratio (OR) with 95% confidence interval (CI); P ≤ 0.05 provides evidence for a possible causal relationship. We used the Steiger test for directionality testing to avoid bias from reverse causation. Analyses were conducted in R 4.3.1 using several packages, including TwoSampleMR, ggplot2, and MRPRESSO. A minimal sketch of the Cochran’s Q computation follows.
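Of these checks, Cochran’s Q is simple enough to sketch from the same per-SNP quantities. The code below (illustrative only, with made-up numbers) compares each SNP’s Wald ratio with the pooled IVW estimate and refers Q to a chi-squared distribution with (number of SNPs - 1) degrees of freedom.

```python
# Cochran's Q heterogeneity sketch: a small p-value signals heterogeneity
# among instruments, i.e., SNPs disagreeing about the causal effect.
import numpy as np
from scipy import stats

def cochran_q(beta_exp, beta_out, se_out):
    beta_exp, beta_out, se_out = map(np.asarray, (beta_exp, beta_out, se_out))
    weights = beta_exp**2 / se_out**2
    ratios = beta_out / beta_exp
    pooled = np.sum(weights * ratios) / np.sum(weights)
    q = np.sum(weights * (ratios - pooled) ** 2)
    # Under homogeneity, Q ~ chi-squared with (n_SNPs - 1) degrees of freedom.
    p = stats.chi2.sf(q, df=len(ratios) - 1)
    return q, p

q, p = cochran_q([0.08, 0.12, 0.05], [0.004, 0.005, 0.002], [0.003, 0.004, 0.002])
print(f"Q = {q:.3f}, p = {p:.3f}")  # here p > 0.05 would indicate no significant heterogeneity
```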

Confirmatory analysis and meta-analysis

To ensure the reliability of our results, we conducted a replication analysis using an additional AD GWAS dataset from the GWAS Catalog, with accession number GCST007320 [34]. This dataset includes 71,880 cases and 383,378 controls, all of European ancestry. We applied this AD dataset in a bidirectional MR analysis with the 12 types of gynecological cancers; the IV selection process, MR analysis standards, and sensitivity testing methods were consistent with the initial analysis. We then performed a meta-analysis to combine the IVW results from the initial and replication analyses that showed causal associations. The choice of effect model was based on the heterogeneity of the results: when heterogeneity was not significant, a fixed-effect model was used; otherwise, a random-effects model was applied [35]. The meta-analysis was performed using the meta package and Review Manager 5.3 (a minimal sketch of the fixed-effect pooling step follows).
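To illustrate the pooling step under the fixed-effect choice, the sketch below combines two IVW estimates by inverse-variance weighting on the log-OR scale. The standard errors are hypothetical placeholders, and the ORs merely echo the EC estimates as plausible magnitudes; the actual analysis used the meta package in R and Review Manager 5.3.

```python
# Fixed-effect meta-analysis sketch: pool two IVW log-odds ratios by
# inverse-variance weighting (used when Cochran's Q shows no heterogeneity).
import numpy as np

def pool_fixed(log_ors, ses):
    log_ors, ses = np.asarray(log_ors), np.asarray(ses)
    w = 1.0 / ses**2                                  # inverse-variance weights
    pooled = np.sum(w * log_ors) / np.sum(w)
    se = 1.0 / np.sqrt(np.sum(w))
    return np.exp(pooled), (np.exp(pooled - 1.96 * se), np.exp(pooled + 1.96 * se))

# Hypothetical log(OR) and SE from the initial and replication IVW analyses.
or_pooled, ci = pool_fixed(log_ors=[np.log(1.032), np.log(1.013)],
                           ses=[0.015, 0.006])
print(f"pooled OR = {or_pooled:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```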

Results

Causal effect of gynecologic cancer on AD

In the forward MR analysis, we included a total of 78 independent SNPs associated with BC, 2 with CA, 4 with CSCC, 6 with CC, 11 with EC, 5 with OEC, 6 with OC, 11 with OSC, 7 with BCIS, 12 with CCIS, 1 with ECIS, and 3 with VCIS (Supplementary Table 1). Importantly, all IVs exhibited F-statistics well above 10, ranging from 256.345 to 34,274.379, indicating a low risk of weak-instrument bias and supporting the strong-instrument assumption required for MR (Supplementary Table 3).

The IVW method showed that, with AD as the outcome, EC increased the risk of AD (P_IVW = 0.037, OR [95% CI] = 1.032 [1.002, 1.064]), OC suppressed the risk of AD (P_IVW = 0.016, OR [95% CI] = 0.974 [0.954, 0.995]), and CCIS promoted the risk of AD (P_IVW = 0.046, OR [95% CI] = 1.032 [1.011, 1.064]) (Fig. 2A). The estimates from MR-Egger, WM, Simple Mode, and Weighted Mode were consistent with the direction of the IVW results, indicating the robustness and reliability of the primary analysis. No causal relationship was found between the other common tumors in women and AD.

Figure 2. Forest plots depicting the causal estimates between gynecological cancer and AD. (A) Forward MR analysis forest plot, with gynecological cancer as the exposure and AD as the outcome. (B) Reverse MR analysis forest plot, with AD as the exposure and gynecological cancer as the outcome. Abbreviations: N SNPs, number of SNPs; OR, odds ratio; CI, confidence interval; AD, Alzheimer’s disease; BC, breast cancer; CA, cervical adenocarcinoma; CSCC, cervical squamous cell carcinoma; CC, cervical cancer; EC, endometrial cancer; OEC, ovarian endometrioid carcinoma; OC, ovarian cancer; OSC, ovarian serous carcinoma; BCIS, breast carcinoma in situ; CCIS, cervical carcinoma in situ; ECIS, endometrial carcinoma in situ; VCIS, vulvar carcinoma in situ; IVW, inverse-variance weighted

Causal effect of AD on gynecologic cancer

In the reverse MR analysis, with gynecologic cancer as the outcome, we observed a causal link between AD and CC, VCIS, and OEC. Specifically, AD appears to elevate the risk of CC (P_IVW = 0.039, OR [95% CI] = 1.395 [1.017, 1.914]) and VCIS (P_IVW = 0.041, OR [95% CI] = 1.761 [1.027, 2.021]), while conversely reducing the risk of OEC (P_IVW = 0.034, OR [95% CI] = 0.634 [0.417, 0.966]), as illustrated in Fig. 2B and detailed in Supplementary Table 2. Notably, no causal links were found between AD and the other common tumors in females. The consistency of these findings across the analytical methods, all aligning with the direction of the IVW results, underscores their reliability. Furthermore, all IVs exhibit F-statistics well above 10 (Supplementary Tables 4 to 15), suggesting minimal weak-instrument bias in the MR analysis.

Sensitivity analyses results

In addition, leave-one-out sensitivity analysis in both the forward and reverse MR analyses showed that no single SNP drove the results (Fig. 3). The funnel plots revealed no evidence of asymmetry, indicating a low risk of directional pleiotropy. MR-Egger regression testing, with all P-values > 0.05, showed no directional pleiotropy between gynecologic cancer and AD. Cochran’s Q statistic showed no significant heterogeneity among instrumental SNP effects (P > 0.05). Furthermore, MR-PRESSO detected no statistically significant outliers or influential points (P > 0.05), suggesting no significant interference or bias in the evaluated exposure-outcome relationships (Table 2). Supplementary Figs. 1–23 present the scatter plots, funnel plots, leave-one-out sensitivity analyses, and forest plots.

Figure 3. The leave-one-out plots of SNPs associated with gynecological cancer and AD. (A) Forward MR leave-one-out sensitivity analysis for EC on Alzheimer’s disease. (B) Forward MR leave-one-out sensitivity analysis for CCIS on Alzheimer’s disease. (C) Forward MR leave-one-out sensitivity analysis for OC on Alzheimer’s disease. (D) Reverse MR leave-one-out sensitivity analysis for Alzheimer’s disease on CC. (E) Reverse MR leave-one-out sensitivity analysis for Alzheimer’s disease on VCIS. (F) Reverse MR leave-one-out sensitivity analysis for Alzheimer’s disease on OEC. Abbreviations: CC, cervical cancer; EC, endometrial cancer; OEC, ovarian endometrioid carcinoma; OC, ovarian cancer; CCIS, cervical carcinoma in situ; VCIS, vulvar carcinoma in situ

Validation analysis and meta-analysis

After applying the additional AD GWAS dataset (accession number GCST007320) in a bidirectional MR analysis, we observed trends similar to those found in the initial analysis. Specifically, EC (P_IVW = 0.030, OR [95% CI] = 1.013 [1.011, 1.016]) and CCIS (P_IVW = 0.001, OR [95% CI] = 1.007 [1.006, 1.008]) may increase the risk of AD, while OC (P_IVW = 0.045, OR [95% CI] = 0.997 [0.995, 0.999]) may reduce it. Reverse MR analysis indicated that AD might increase the risk of CC (P_IVW = 0.038, OR [95% CI] = 2.257 [1.592, 3.199]) and VCIS (P_IVW = 0.008, OR [95% CI] = 2.210 [2.047, 2.386]), and decrease the risk of OEC (P_IVW = 0.015, OR [95% CI] = 0.798 [0.735, 0.866]). Sensitivity analyses showed no irregularities. Meta-analysis of the OR results from the two IVW analyses further reinforced these findings; details are given in Fig. 4 and Supplementary Tables 16 and 17. The scatter plots, funnel plots, leave-one-out sensitivity analyses, and forest plots for the replication MR analyses can be found in Supplementary Figs. 24–46.

Figure 4. Meta-analysis of causal associations (P_IVW < 0.05) between gynecological cancer and AD. (A) Meta-analysis of OR [95% CI] for EC with AD as the outcome. (B) Meta-analysis of OR [95% CI] for OC with AD as the outcome. (C) Meta-analysis of OR [95% CI] for CCIS with AD as the outcome. (D) Meta-analysis of OR [95% CI] for CC with AD as the exposure. (E) Meta-analysis of OR [95% CI] for VCIS with AD as the exposure. (F) Meta-analysis of OR [95% CI] for OEC with AD as the exposure. Abbreviations: CC, cervical cancer; EC, endometrial cancer; OEC, ovarian endometrioid carcinoma; OC, ovarian cancer; CCIS, cervical carcinoma in situ; VCIS, vulvar carcinoma in situ; 95% CI, 95% confidence interval; OR, odds ratio

Discussion

In today’s society, as women’s roles in social life grow, the incidence of common tumors in women, shaped by their distinct physiology and hormone levels, continues to rise. In addition, epidemiological data show that AD, which disproportionately affects women [36], presents significant challenges to society and families. Currently, there is a lack of solid scientific evidence linking AD with common cancers in women. We therefore initiated a bidirectional MR study, employing comprehensive GWAS summary statistics, to investigate potential causal links between common female cancers and AD.

Specifically, our forward MR analysis suggests that individuals with EC and CCIS may be at higher risk of developing AD, whereas those with OC may have a lower risk. Other common female cancers, such as BC, do not appear to affect the risk of developing AD. Although current research has shown that cancer survivors have a reduced incidence of AD [37], there are also studies demonstrating a connection between EC and AD. One study found that the expression of SERPINA3 in EC is associated with disease progression, poor differentiation, high malignancy, and advanced stages of cancer [38], especially in cells lacking estrogen receptor (ER) expression: increased expression of SERPINA3 was observed in these ER-negative cells, and suppressing the expression of the SERPINA3 gene can inhibit the proliferation of cancer cells. SERPINA3 also plays a crucial role in the development of AD [39], with elevated levels of SERPINA3 protein found in the blood, brain (including the hippocampus), and cerebrospinal fluid of AD patients [40]. Analyses have shown that SERPINA3 protein is one of the components of amyloid plaques in AD, and an increase in SERPINA3 levels in the cerebrospinal fluid may be indicative of mild cognitive impairment during AD progression [41]. Additionally, EC and AD are interconnected through multiple common pathways, such as the mTOR signaling network [42] and G-protein-coupled receptors (GPCRs) [43], which play significant roles in the pathology of both diseases. THOP1 (thimet oligopeptidase), a neuropeptide-processing enzyme, is significantly increased in AD brain tissue as a neuroprotective response to Aβ toxicity [44, 45]. However, in a comparative transcriptome analysis, researchers observed that THOP1 expression was significantly downregulated as EC progressed to its late stages, weakening its neuroprotective effect against AD [46].

CCIS, also known as grade 3 cervical intraepithelial neoplasia [47], progresses to invasive cancer in approximately 30–50% of cases [48]. CCIS is mainly associated with human papillomavirus (HPV) infection [49]. Our findings indicate that CCIS might increase the risk of developing AD, and a preliminary connection between the two diseases emerges from a detailed comparison of their pathology, etiology, and biological mechanisms. HPV may play a latent role in AD development, especially with regard to inflammation and oxidative stress. Research using a systems biology approach has found that HPV interacts with several crucial genes linked to AD, such as EGFR, APOE, APP, and CASP8 [50]. Further research has indicated that HPV could disrupt the mucosal barrier and modify immune reactions, allowing invasive yeast to disseminate into the brain, triggering inflammatory cytokines and thereby facilitating the generation of Aβ protein, indirectly contributing to AD [51]. Additionally, machine learning studies have identified HPV-71 (OR = 3.56, P = 0.02) as a potential risk factor for AD [52]. This points to the need for more comprehensive research to investigate the association between CCIS and AD and to explain their underlying biological mechanisms.

Furthermore, this study found that OC may reduce the risk of AD, a link that could involve the multifunctional protein BAG3. BAG3 regulates various cellular processes, such as apoptosis, development, and selective autophagy [53], and it has a significant impact on the development of both OC and AD. In OC, BAG3 enhances the invasive capabilities of tumor cells by interacting with matrix metalloproteinase-2, a calcium-dependent peptidase involved in extracellular matrix remodeling [54]. It also promotes cancer cell proliferation by interacting with the 3’-untranslated region of Skp2 mRNA, countering the suppressive effects of miR-21-5p on Skp2 expression and thereby bolstering the survival capacity of tumor cells [55]. Although miR-340 inhibits the survival and promotes the apoptosis of OC cells by downregulating BAG3, overexpression of BAG3 effectively counteracts these effects and further accelerates tumor development by activating the PI3K/AKT signaling pathway [56]. In patients with AD, by contrast, BAG3 plays a neuroprotective role. Research demonstrated that specifically removing BMAL1, a protein involved in circadian rhythms, from a mouse model activated astrocytes and stimulated BAG3 expression; the increased BAG3 expression allowed astrocytes to consume αSyn and tau more efficiently, diminishing their activity in AD models and thereby helping to manage the balance of neurotoxic proteins during AD progression [57]. Furthermore, research revealed that increasing BAG3 expression under proteasome inhibition promoted the degradation of tau in neurons and decreased phosphorylated tau levels [58]. In clinical research, a large cross-sectional study found that patients diagnosed with OC had a lower risk of developing AD upon discharge (multivariate OR [95% CI] = 0.35 [0.30–0.41]) [59], indicating a strong negative correlation. OC treatment often involves oophorectomy, and a Danish prospective study found that dementia incidence increased by 18% following bilateral oophorectomy, while it decreased by 13% after unilateral oophorectomy [60]. This could be related to the common use of hormone replacement therapy (HRT) post-oophorectomy, and epidemiological studies have found that estrogen has a protective effect against AD [61, 62, 63]. This may be one reason why OC survivors are less likely to develop AD.

Our reverse MR results revealed that AD may heighten the risk of CC and VCIS while possibly lowering the risk of OEC. CC, one of the common malignancies driving female cancer mortality [64], is generally preventable through early screening and treatment. However, dementia patients are less likely than the general population to undergo the Papanicolaou smear test (PST) for CC prevention [65]. Meanwhile, epidemiological studies reveal a higher incidence of VCIS and OEC in middle-aged and elderly individuals, predominantly middle-aged women [66, 67]. Because many older women tolerate symptoms longer, owing to factors such as age, lifestyle convenience, and cognitive decline, they attend fewer regular health check-ups and cancer screenings, even though timely CC screening can effectively prevent such disease. The incidence of CC is strongly associated with high-risk HPV infections, which also cause abnormal proliferation of vulvar cells, increasing the risk of carcinogenesis [68]. Vaccination against HPV effectively reduces the risk of VCIS and CC [69]. Following a diagnosis of AD, the risk of misdiagnosis or delayed diagnosis of other conditions such as CC, VCIS, and OEC is heightened by the decline in cognitive function and expressive abilities.

In recent years, a growing body of research has proposed that AD may be primarily an autoimmune disease occurring within the brain [70]. The immune system in AD patients undergoes various changes; a study based on the Healthy Aging in Neighborhoods of Diversity across the Life Span cohort found a significant correlation between the rate of immune decline and the rate of cognitive decline, with poorer immune function associated with worse cognitive abilities [71]. The immune system is closely linked to the incidence of CC and VCIS, and immune suppression is an established risk factor for CC [72]. HPV infection is one of the primary causes of many gynecological cancers; it may evade host immune surveillance, leading to CC and VCIS [73, 74]. Consequently, compromised immune function facilitates persistent HPV infections, likely a critical factor in AD patients’ higher risk of developing CC and VCIS. Female AD patients often have reduced estrogen levels [75, 76], whereas excessive estrogen secretion is a common cause of OEC [77]; AD patients may therefore have a reduced risk of OEC through the estrogen pathway. Mutations in the PTEN gene are involved in the development of both AD and OEC [78]; PTEN, a tumor suppressor gene, regulates the proliferation and differentiation of neural stem cells in the nervous system, affecting neural regeneration [79]. In AD patients, PTEN often shows decreased levels and an altered distribution [80]. PTEN deletion is also a common driver of OEC [81, 82], and patients with deficient PTEN expression experience worse outcomes [83]. Further research is therefore needed to verify whether AD confers a protective effect against OEC. Our findings suggest that future work could include enhanced screening and prevention of CC and VCIS among AD patients.

Our bidirectional MR design circumvents the reverse-causality and confounding biases encountered in traditional observational studies [84], and it also sidesteps the practical difficulty of clinically observing patients with both cancer and cognitive impairment. This study marks the first attempt to explore the causal relationship between AD and gynecological cancer, providing a novel research perspective on the prevention of both diseases. CC and CCIS represent cancers of the cervix at different stages, OC is a general term for malignant ovarian tumors, and OEC is a subtype of these tumors. Because the heterogeneity of cancer produces diversity and variability across stages and types, our study aimed to include as many current classifications of gynecological cancer as possible, and our results show that the causal relationship between AD and tumors of different natures and grades varies, even for tumors occurring at the same site. Our study nonetheless has certain limitations. In the reverse MR analysis, our conclusions indicate that AD patients are at elevated risk of CC and VCIS, with a negative correlation between AD and OEC incidence, but direct clinical observation studies confirming this are lacking. Additionally, the IVs for gynecological cancer were uniformly sourced from the Finnish database to ensure data consistency; although the results across all methods exhibit a degree of robustness, the relatively small sample size and limited number of available IVs remain a limitation. A replication MR analysis using an additional AD GWAS dataset further validated the reliability of our findings. Finally, our analysis focuses on the European population, so caution is needed when generalizing the findings to other populations.

In conclusion, our study findings support the hypothesis of a causal relationship between AD and certain gynecological cancers. However, to validate this study’s results, we recommend including a more extensive dataset from gynecological cancer GWAS and incorporating additional genetic IVs. We encourage more researchers to investigate the relationship between female AD patients and the incidence of gynecological cancer and to continue in-depth research in this field.

Data availability

Data is provided within the manuscript or supplementary information files. The data for 12 gynecological cancers is sourced from the FinnGen database, and the dataset link is https://figshare.com/articles/dataset/Gynecological_cancer_application_data/24980757 . The GWAS data for Alzheimer’s disease is sourced from IEU open GWAS and GWAS catalog, and the relevant dataset can be obtained from the following link: https://gwas.mrcieu.ac.uk/datasets/ebi-a-GCST90027158/ , https://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST007001-GCST008000/GCST007320/ .

Abbreviations

  • AD: Alzheimer’s disease
  • MR: Mendelian randomization
  • BC: Breast cancer
  • CA: Cervical adenocarcinoma
  • CSCC: Cervical squamous cell carcinoma
  • CC: Cervical cancer
  • EC: Endometrial cancer
  • OEC: Ovarian endometrioid carcinoma
  • OC: Ovarian cancer
  • OSC: Ovarian serous carcinoma
  • BCIS: Breast carcinoma in situ
  • CCIS: Cervical carcinoma in situ
  • ECIS: Endometrial carcinoma in situ
  • VCIS: Vulvar carcinoma in situ
  • IVW: Inverse-variance weighted
  • SNP: Single nucleotide polymorphism
  • IV: Instrumental variable
  • GWAS: Genome-wide association study
  • EADB: European Alzheimer’s Disease Biobank
  • LD: Linkage disequilibrium
  • WM: Weighted median
  • CI: Confidence interval
  • ER: Estrogen receptor
  • GPCR: G-protein-coupled receptor
  • HPV: Human papillomavirus
  • HRT: Hormone replacement therapy
  • PST: Papanicolaou smear test

References

Ledford LRC, Lockwood S. Scope and epidemiology of gynecologic cancers: an overview. Semin Oncol Nurs. 2019;35(2):147–50.

Kim M, Suh DH, Lee KH, Eom KY, Toftdahl NG, Mirza MR, et al. Major clinical research advances in gynecologic cancer in 2018. J Gynecol Oncol. 2019;30(2):e18.

Kim M, Suh DH, Lee KH, Eom KY, Lee JY, Lee YY, et al. Major clinical research advances in gynecologic cancer in 2019. J Gynecol Oncol. 2020;31(3):e48.

Waldmann A, Eisemann N, Katalinic A. Epidemiology of malignant cervical, Corpus Uteri and Ovarian tumours - Current Data and Epidemiological trends. Geburtshilfe Frauenheilkd. 2013;73(2):123–9.

Arnold M, Morgan E, Rumgay H, Mafra A, Singh D, Laversanne M, et al. Current and future burden of breast cancer: global statistics for 2020 and 2040. Breast Edinb Scotl. 2022;66:15–23.

Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer statistics 2020: GLOBOCAN estimates of incidence and Mortality Worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–49.

Stewart SL, Lakhani N, Brown PM, Larkin OA, Moore AR, Hayes NS. Gynecologic cancer prevention and control in the National Comprehensive Cancer Control Program: progress, current activities, and future directions. J Womens Health 2002. 2013;22(8):651–7.

Keyvani V, Riahi E, Yousefi M, Esmaeili SA, Shafabakhsh R, Moradi Hasan-Abad A, et al. Gynecologic cancer, cancer stem cells, and possible targeted therapies. Front Pharmacol. 2022;13:823572.

Tahami Monfared AA, Byrnes MJ, White LA, Zhang Q. Alzheimer’s Disease: Epidemiology and Clinical Progression. Neurol Ther. 2022;11(2):553–69.

Chan KY, Wang W, Wu JJ, Liu L, Theodoratou E, Car J, et al. Epidemiology of Alzheimer’s disease and other forms of dementia in China, 1990–2010: a systematic review and analysis. Lancet Lond Engl. 2013;381(9882):2016–23.

Niu H, Álvarez-Álvarez I, Guillén-Grima F, Aguinaga-Ontoso I. Prevalence and incidence of Alzheimer’s disease in Europe: a meta-analysis. Neurol Barc Spain. 2017;32(8):523–32.

Nebel RA, Aggarwal NT, Barnes LL, Gallagher A, Goldstein JM, Kantarci K, et al. Understanding the impact of sex and gender in Alzheimer’s disease: a call to action. Alzheimers Dement J Alzheimers Assoc. 2018;14(9):1171–83.

Sherzai AZ, Parasram M, Haider JM, Sherzai D. Alzheimer disease and cancer: a national inpatient sample analysis. Alzheimer Dis Assoc Disord. 2020;34(2):122.

Ospina-Romero M, Glymour MM, Hayes-Larson E, Mayeda ER, Graff RE, Brenowitz WD, et al. Association between alzheimer disease and cancer with evaluation of study biases. JAMA Netw Open. 2020;3(11):e2025515.

Dong Z, Xu M, Sun X, Wang X. Mendelian randomization and transcriptomic analysis reveal an inverse causal relationship between Alzheimer’s disease and cancer. J Transl Med. 2023;21(1):527.

Du XL, Song L, Schulz PE, Xu H, Chan W. Risk of developing alzheimer’s disease and related dementias in association with cardiovascular disease, stroke, hypertension, and diabetes in a large cohort of women with breast cancer and with up to 26 years of follow-up. J Alzheimers Dis. 2022;87(1):415–32.

Mandelblatt JS, Small BJ, Luta G, Hurria A, Jim H, McDonald BC, et al. Cancer-related cognitive outcomes among older breast cancer survivors in the thinking and living with cancer study. J Clin Oncol. 2018;36(32):3211–22.

Lehrer S, Rheinstein PH. Breast Cancer, Alzheimer’s Disease, and APOE4 allele in the UK Biobank Cohort. J Alzheimers Dis Rep. 2021;5(1):49–53.

Xu WY, Raver E, Jung J, Li Y, Thai G, Lee S. Rural-urban disparities in preventive breast and cervical cancer screening among women with early-onset dementia. BMC Womens Health. 2023;23(1):255.

Gorin SS, Heck JE, Albert S, Hershman D. Treatment for breast cancer in patients with Alzheimer’s disease. J Am Geriatr Soc. 2005;53(11):1897–904.

Lawlor DA, Harbord RM, Sterne JAC, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27(8):1133–63.

Burgess S, Thompson SG. Bias in causal estimates from mendelian randomization studies with weak instruments. Stat Med. 2011;30(11):1312–23.

Ziegler A, Mwambi H, König IR. Mendelian randomization versus path models: making Causal inferences in genetic epidemiology. Hum Hered. 2015;79(3–4):194–204.

Emdin CA, Khera AV, Kathiresan S, Mendelian Randomization. JAMA. 2017;318(19):1925–6.

Kurki MI, Karjalainen J, Palta P, Sipilä TP, Kristiansson K, Donner KM, et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature. 2023;613(7944):508–18.

Bellenguez C, Küçükali F, Jansen IE, Kleineidam L, Moreno-Grau S, Amin N, et al. New insights into the genetic etiology of Alzheimer’s disease and related dementias. Nat Genet. 2022;54(4):412–36.

Pierce BL, Ahsan H, Vanderweele TJ. Power and instrument strength requirements for mendelian randomization studies using multiple genetic variants. Int J Epidemiol. 2011;40(3):740–52.

Palmer TM, Lawlor DA, Harbord RM, Sheehan NA, Tobias JH, Timpson NJ, et al. Using multiple genetic variants as instrumental variables for modifiable risk factors. Stat Methods Med Res. 2012;21(3):223–42.

Burgess S, Scott RA, Timpson NJ, Davey Smith G, Thompson SG, EPIC- InterAct Consortium. Using published data in mendelian randomization: a blueprint for efficient identification of causal risk factors. Eur J Epidemiol. 2015;30(7):543–52.

Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512–25.

Burgess S, Thompson SG. Interpreting findings from mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32(5):377–89.

Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(5):693–8.

Burgess S, Bowden J, Fall T, Ingelsson E, Thompson SG. Sensitivity analyses for robust causal inference from mendelian randomization analyses with multiple genetic variants. Epidemiol Camb Mass. 2017;28(1):30–42.

Jansen IE, Savage JE, Watanabe K, Bryois J, Williams DM, Steinberg S, et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing alzheimer’s disease risk. Nat Genet. 2019;51(3):404–13.

Jackson D, White IR, Riley RD. Quantifying the impact of between-study heterogeneity in multivariate meta-analyses. Stat Med. 2012;31(29):3805–20.

Santiago JA, Quinn JP, Potashkin JA. Sex-specific transcriptional rewiring in the brain of alzheimer’s disease patients. Front Aging Neurosci. 2022;14:1009368.

Lanni C, Masi M, Racchi M, Govoni S. Cancer and Alzheimer’s disease inverse relationship: an age-associated diverging derailment of shared pathways. Mol Psychiatry. 2021;26(1):280–95.

de Mezer M, Rogaliński J, Przewoźny S, Chojnicki M, Niepolski L, Sobieska M, et al. SERPINA3: stimulator or inhibitor of pathological changes. Biomedicines. 2023;11(1):156.

Kamboh MI, Minster RL, Kenney M, Ozturk A, Desai PP, Kammerer CM, et al. Alpha-1-antichymotrypsin (ACT or SERPINA3) polymorphism may affect age-at-onset and disease duration of Alzheimer’s disease. Neurobiol Aging. 2006;27(10):1435–9.

Han Y, Jia J, Jia XF, Qin W, Wang S. Combination of plasma biomarkers and clinical data for the detection of sporadic Alzheimer’s disease. Neurosci Lett. 2012;516(2):232–6.

Wang Y, Sun Y, Wang Y, Jia S, Qiao Y, Zhou Z, et al. Identification of novel diagnostic panel for mild cognitive impairment and Alzheimer’s disease: findings based on urine proteomics and machine learning. Alzheimers Res Ther. 2023;15(1):191.

Rosner M, Fuchs C, Siegel N, Valli A, Hengstschläger M. New insights into the role of the tuberous sclerosis genes in leukemia. Leuk Res. 2009;33(7):883–5.

Abou-Elhamd AS, Kalamegam G, Ahmed F, Assidi M, Alrefaei AF, Pushparaj PN, et al. Unraveling the catha edulis extract effects on the cellular and molecular signaling in SKOV3 cells. Front Pharmacol. 2021;12:666885.

Pollio G, Hoozemans JJM, Andersen CA, Roncarati R, Rosi MC, van Haastert ES, et al. Increased expression of the oligopeptidase THOP1 is a neuroprotective response to abeta toxicity. Neurobiol Dis. 2008;31(1):145–58.

Shi Y, Liu H, Yang C, Xu K, Cai Y, Wang Z, et al. Transcriptomic analyses for identification and prioritization of genes associated with alzheimer’s disease in humans. Front Bioeng Biotechnol. 2020;8:31.

Cho-Clark MJ, Sukumar G, Vidal NM, Raiciulescu S, Oyola MG, Olsen C, et al. Comparative transcriptome analysis between patient and endometrial cancer cell lines to determine common signaling pathways and markers linked to cancer progression. Oncotarget. 2021;12(26):2500–13.

Falcaro M, Castañon A, Ndlela B, Checchi M, Soldan K, Lopez-Bernal J, et al. The effects of the national HPV vaccination programme in England, UK, on cervical cancer and grade 3 cervical intraepithelial neoplasia incidence: a register-based observational study. Lancet Lond Engl. 2021;398(10316):2084–92.

McCredie MRE, Sharples KJ, Paul C, Baranyai J, Medley G, Jones RW, et al. Natural history of cervical neoplasia and risk of invasive cancer in women with cervical intraepithelial neoplasia 3: a retrospective cohort study. Lancet Oncol. 2008;9(5):425–34.

Nishio M, To Y, Maehama T, Aono Y, Otani J, Hikasa H, et al. Endogenous YAP1 activation drives immediate onset of cervical carcinoma in situ in mice. Cancer Sci. 2020;111(10):3576–87.

Talwar P, Gupta R, Kushwaha S, Agarwal R, Saso L, Kukreti S, et al. Viral induced oxidative and inflammatory response in alzheimer’s disease pathogenesis with identification of potential drug candidates: a systematic review using systems biology approach. Curr Neuropharmacol. 2019;17(4):352–65.

Block J. Alzheimer’s disease might depend on enabling pathogens which do not necessarily cross the blood-brain barrier. Med Hypotheses. 2019;125:129–36.

Tejeda M, Farrell J, Zhu C, Wetzler L, Lunetta KL, Bush WS, et al. DNA from multiple viral species is associated with alzheimer’s disease risk. Alzheimers Dement. 2023;20(1):253–65.

Stürner E, Behl C. The role of the multifunctional BAG3 protein in cellular protein quality control and in disease. Front Mol Neurosci. 2017;10:177.

Suzuki M, Iwasaki M, Sugio A, Hishiya A, Tanaka R, Endo T, et al. BAG3 (BCL2-associated athanogene 3) interacts with MMP-2 to positively regulate invasion by ovarian carcinoma cells. Cancer Lett. 2011;303(1):65–71.

Yan J, Liu C, Jiang JY, Liu H, Li C, Li XY, et al. BAG3 promotes proliferation of ovarian cancer cells via post-transcriptional regulation of Skp2 expression. Biochim Biophys Acta Mol Cell Res. 2017;1864(10):1668–78.

Qu F, Wang X. microRNA-340 induces apoptosis by downregulation of BAG3 in ovarian cancer SKOV3 cells. Pharm. 2017;72(8):482–6.

Sheehan PW, Nadarajah CJ, Kanan MF, Patterson JN, Novotny B, Lawrence JH, et al. An astrocyte BMAL1-BAG3 axis protects against alpha-synuclein and tau pathology. Neuron. 2023;111(15):2383–e23987.

Lei Z, Brizzee C, Johnson GVW. BAG3 facilitates the clearance of endogenous tau in primary neurons. Neurobiol Aging. 2015;36(1):241–8.

Sherzai AZ, Parasram M, Haider JM, Sherzai D. Alzheimer Disease and Cancer: A National Inpatient Sample Analysis. Alzheimer Dis Assoc Disord. 2020;34(2):122–7.

Uldbjerg CS, Wilson LF, Koch T, Christensen J, Dehlendorff C, Priskorn L, et al. Oophorectomy and rate of dementia: a prospective cohort study. Menopause N Y N. 2022;29(5):514–22.

Tang MX, Jacobs D, Stern Y, Marder K, Schofield P, Gurland B, et al. Effect of oestrogen during menopause on risk and age at onset of Alzheimer’s disease. Lancet Lond Engl. 1996;348(9025):429–32.

Pike CJ. Sex and the development of Alzheimer’s disease. J Neurosci Res. 2017;95(1–2):671–80.

Honjo H, Kikuchi N, Hosoda T, Kariya K, Kinoshita Y, Iwasa K, et al. Alzheimer’s disease and estrogen. J Steroid Biochem Mol Biol. 2001;76(1–5):227–30.

Global Burden of Disease Cancer Collaboration, Fitzmaurice C, Allen C, Barber RM, Barregard L, Bhutta ZA, et al. Global, Regional, and National Cancer incidence, mortality, years of Life Lost, Years lived with disability, and disability-adjusted life-years for 32 Cancer groups, 1990 to 2015: a systematic analysis for the global burden of Disease Study. JAMA Oncol. 2017;3(4):524–48.

Chen CY, Kung PT, Chiu LT, Tsai WC. Comparison of Cervical Cancer Screening used between individuals with disabilities and individuals without disabilities. Healthc Basel Switz. 2023;11(10):1363.

Judson PL, Habermann EB, Baxter NN, Durham SB, Virnig BA. Trends in the incidence of invasive and in situ vulvar carcinoma. Obstet Gynecol. 2006;107(5):1018–22.

Moro F, Magoga G, Pasciuto T, Mascilini F, Moruzzi MC, Fischerova D, et al. Imaging in gynecological disease (13): clinical and ultrasound characteristics of endometrioid ovarian cancer. Ultrasound Obstet Gynecol off J Int Soc Ultrasound Obstet Gynecol. 2018;52(4):535–43.

FUTURE II Study Group. Quadrivalent vaccine against human papillomavirus to prevent high-grade cervical lesions. N Engl J Med. 2007;356(19):1915–27.

Berenson AB, Chang M, Hawk ET, Ramondetta LM, Hoang T. Vulvar cancer incidence in the United States and its relationship to human papillomavirus vaccinations, 2001–2018. Cancer Prev Res Phila Pa. 2022;15(11):777–84.

Jevtic S, Sengar AS, Salter MW, McLaurin J. The role of the immune system in alzheimer disease: etiology and treatment. Ageing Res Rev. 2017;40:84–94.

Beydoun MA, Shaked D, Tajuddin SM, Weiss J, Evans MK, Zonderman AB. Accelerated epigenetic age and cognitive decline among urban-dwelling adults. Neurology. 2020;94(6):e613–25.

Dugué PA, Rebolj M, Garred P, Lynge E. Immunosuppression and risk of cervical cancer. Expert Rev Anticancer Ther. 2013;13(1):29–42.

Hardikar S, Johnson LG, Malkki M, Petersdorf EW, Galloway DA, Schwartz SM, et al. A population-based case-control study of genetic variation in cytokine genes associated with risk of cervical and vulvar cancers. Gynecol Oncol. 2015;139(1):90–6.

Vanajothi R, Srikanth N, Vijayakumar R, Palanisamy M, Bhavaniramya S, Premkumar K. HPV-mediated cervical cancer: a systematic review on immunological basis, molecular biology, and immune evasion mechanisms. Curr Drug Targets. 2022;23(8):782–801.

Yang H, Oh CK, Amal H, Wishnok JS, Lewis S, Schahrer E, et al. Mechanistic insight into female predominance in alzheimer’s disease based on aberrant protein S-nitrosylation of C3. Sci Adv. 2022;8(50):eade0764.

Scheyer O, Rahman A, Hristov H, Berkowitz C, Isaacson RS, Brinton RD, et al. Female sex and alzheimer’s risk: the menopause connection. J Prev Alzheimers Dis. 2018;5(4):225–30.

Ness RB. Endometriosis and ovarian cancer: thoughts on shared pathophysiology. Am J Obstet Gynecol. 2003;189(1):280–94.

Chen S, Li Y, Qian L, Deng S, Liu L, Xiao W, et al. A review of the clinical characteristics and novel molecular subtypes of endometrioid ovarian cancer. Front Oncol. 2021;11:668151.

Li Y, Ma R, Hao X. Therapeutic role of PTEN in tissue regeneration for management of neurological disorders: stem cell behaviors to an in-depth review. Cell Death Dis. 2024;15(4):268.

Griffin RJ, Moloney A, Kelliher M, Johnston JA, Ravid R, Dockery P, et al. Activation of akt/PKB, increased phosphorylation of akt substrates and loss and altered distribution of akt and PTEN are features of alzheimer’s disease pathology. J Neurochem. 2005;93(1):105–17.

Martins FC, Couturier DL, Paterson A, Karnezis AN, Chow C, Nazeran TM, et al. Clinical and pathological associations of PTEN expression in ovarian cancer: a multicentre study from the ovarian tumour tissue analysis consortium. Br J Cancer. 2020;123(5):793–802.

Pejovic T, Cathcart AM, Alwaqfi R, Brooks MN, Kelsall R, Nezhat FR. Genetic links between endometriosis and endometriosis-associated ovarian cancer-a narrative review (endometriosis-associated cancer). Life Basel Switz. 2024;14(6):704.

de Nonneville A, Kalbacher E, Cannone F, Guille A, Adelaïde J, Finetti P et al. Endometrioid ovarian carcinoma landscape: pathological and molecular characterization. Mol Oncol. 2024.

Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23(R1):R89–98.


Acknowledgements

We are grateful for the public data on Alzheimer’s disease and gynecological cancers provided by the Finnish Biobank and the European Alzheimer’s Disease Biobank.

Funding

The study was funded by the Qihuang scholar in the National Support Program for Leading Talents of Traditional Chinese Medicine ([2018] No. 12), the Key Research and Development Project of the Jilin Provincial Department of Science and Technology (Grant No. 20220203153SF) and the Standardization Project of Traditional Chinese Medicine Management in Jilin Province (Grant zybz-2023-027).

Author information

Di Cao and Shaobo Zhang contributed equally to this work and should be considered co-first authors.

Authors and Affiliations

Hubei University of Chinese Medicine, Wuhan, Hubei, 430065, China

Di Cao, Yini Zhang & Ping Wang

Engineering Research Center of TCM Protection Technology and New Product Development for the Elderly Brain Health, Ministry of Education, Wuhan, Hubei, 430065, China

Changchun University of Chinese Medicine, Changchun, Jilin, 130000, China

Shaobo Zhang

Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, 210000, China

Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, 210000, China

The Second Affiliated Hospital of Changchun University of Chinese Medicine, Changchun Hospital of Chinese Medicine, Changchun, Jilin, 130000, China

Qiguang Yang


Contributions

DC and SZ conceived and designed the study. DC, YZ and MS performed the MR analyses. SZ and PW aided in data analyses. QY and PW assisted in interpreting results and writing the manuscript. All authors revised and approved the final manuscript.

Corresponding author

Correspondence to Ping Wang.

Ethics declarations

Ethical approval

The analyses were based on publicly available data approved by relevant review boards. All studies contributing data to these analyses had the appropriate institutional review board approval from each country under the Declaration of Helsinki, and all participants provided informed consent.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Cao, D., Zhang, S., Zhang, Y. et al. Association between gynecologic cancer and Alzheimer’s disease: a bidirectional Mendelian randomization study. BMC Cancer 24, 1032 (2024). https://doi.org/10.1186/s12885-024-12787-5


Received : 02 January 2024

Accepted : 08 August 2024

Published : 21 August 2024

DOI : https://doi.org/10.1186/s12885-024-12787-5


Keywords

  • Gynecological cancer
  • Causal relationship




  17. (PDF) Chapter 3 Research Design and Methodology

    Research Design and Methodology. Chapter 3 consists of three parts: (1) Purpose of the. study and research design, (2) Methods, and (3) Statistical. Data analysis procedure. Part one, Purpose of ...

  18. Experimental data analysis: A guide to the selection of simple

    An example of data analysis using the aforementioned methodologies is illustrated and conclusions drawn, based on statistical analysis of the data, as to the implications of the specific test conditions. The authors intend the paper to provide the basis for improved experimental design incorpoerating statistical analysis of the final data set(s).

  19. Experimental Design and Data Analysis (MAST10011)

    Overview. This subject provides an understanding of the fundamental concepts of probability and statistics required for experimental design and data analysis in the health sciences. Initially the subject introduces common study designs, random sampling and randomised trials as well as numerical and visual methods of summarising data.

  20. Quasi-Experimental Research Design

    Quasi-experimental design is a research method that seeks to evaluate the causal relationships between variables, but without the full control over the independent variable (s) that is available in a true experimental design. In a quasi-experimental design, the researcher uses an existing group of participants that is not randomly assigned to ...

  21. Experimental Research: Meaning And Examples Of Experimental ...

    Experimental research is widely implemented in education, psychology, social sciences and physical sciences. Experimental research is based on observation, calculation, comparison and logic. Researchers collect quantitative data and perform statistical analyses of two sets of variables. This method collects necessary data to focus on facts and ...

  22. Long-Term Selection Cutting Study

    Since 1952, scientists have been collecting data from the Cutting Methods Study on the Argonne Experimental Forest in Wisconsin. The study compares how different forest management strategies affect the growth of desirable trees. The original goal was to figure out how to manage trees to get the greatest number of large, healthy trees for timber and other wood products.Today, NRS research ...

  23. Comparison of the SBAR method and modified handover model on handover

    This research was designed as a semi-experimental study, with census survey method used for sampling. In order to collect data, Nurse Perception of Hanover Questionnaire (NPHQ) and Handover Quality Rating Tool (HQRT) were used after translating and confirming validity and reliability used to direct/collect data.

  24. High-velocity compressible gas flow modelling through ...

    In this part, the developed methodologies are applied to various experimental data sets. In Section 2.1, previously published experimental data sets are used, while Section 2.2 consists of the experimental investigations carried out by the Heriot-Watt University Gas Condensate Recovery (HWU-GCR) research team. These tests have carefully been ...

  25. Toward a foundation model of causal cell and tissue biology with a

    This Review describes experimental platforms for high-throughput, multi-modal perturbation screens and associated computational methods for interpreting data and predicting outcomes. The authors issue a call to build a Perturbation Cell Atlas to help the research community better understand causal cell and tissue biology.

  26. 30 Years of Experimental Education Research in the Post-Soviet Space: A

    This is supplementary material to the article "30 Years of Experimental Education Research in the Post-Soviet Space: A Meta-Analysis of Interventions". This meta-analysis systematically evaluates the potential of available research in post-Soviet countries as a basis for an evidence-based approach to improving student achievement. The study was conducted on a selection of 41 publications ...

  27. Applied Sciences

    In order to solve this problem, this article proposes an anomaly detection method for battery cells based on Robust Principal Component Analysis (RPCA), taking the historical operation and maintenance data of a large-scale battery pack from an energy storage station as the research subject.

  28. Coupled Heat Transfer and Applications of Machine Learning

    With the continued development of artificial intelligence technology, machine learning has become an important method of predictive analysis. In particular, coupled heat transfer analysis based on machine learning has significant advantages in terms of reducing experimental costs and modeling time. Through machine learning, researchers can rapidly determine the coupled heat transfer processes ...

  29. Numerical and experimental analysis of biomimetic tubercle for

    Numerical and experimental analysis of biomimetic tubercle for cavitation suppression in viscous oil flow around hydrofoil ... Beijing Institute of Technology, Beijing, People's Republic of China;d Advanced Technology Research Institute, Beijing Institute of Technology, Jinan, People's Republic ... The data that support the findings of this ...

  30. Association between gynecologic cancer and Alzheimer's disease: a

    Study design. To explore the causal relationship between gynecologic cancer and AD, the present study conducted a bidirectional MR analysis. Figure 1 illustrates our study's methodology and process. The selected genetic IVs need to satisfy three assumptions of MR analysis [].First, genetic variations are assumed to be closely related to the exposure event.