Electronic Journal of Science Education V4 N2 December 1999 Dass
Evaluation of a District-wide Inservice Professional Development Program for Teaching Science:
Challenges Faced and Lessons Learned

Pradeep Maxwell Dass
Northeastern Illinois University

Reform of Professional Development: The Iowa Chautauqua Model

According to Sykes (1996), "What lends urgency to professional development is its connection to reform and to the ambitious new goals for education that are to be extended to all students." The connection of professional development of science teachers to reform of science education is becoming increasingly realized. The centrality of the role of professional development in bringing about science education reform is evident in several recent actions. For instance, the National Science Education Standards (National Research Council, 1996) include a separate section devoted to professional development standards and the Goals 2000: Educate America Act, Title I, lists professional development as one of the eight National Education Goals.

While the importance of professional development in bringing about science education reform has become increasingly obvious, the limitations of traditional forms of professional development in contributing to reform have also been recognized. The traditional 'one-shot' approach to professional development has recently come under attack as being inadequate and inappropriate in the context of current educational reform efforts, and as being out of step with current research about teacher learning (Darling-Hammond & McLaughlin, 1995; Fullan, 1995; Kyle, 1995; Lieberman, 1995; Lieberman & Miller, 1992; Little, 1993; Miles, 1995). A new perspective on professional development of teachers has become a crucial first step in the reform process. Lieberman (1995, p. 592) notes, "The conventional view of staff development as a transferable package of knowledge to be distributed to teachers in bite-sized pieces needs radical rethinking." However, it is also recognized that there is a dearth of information about the factors that contribute to effective professional development as well as examples of programs and efforts which point to effective practice (Kyle, 1995; Sparks & Loucks-Horsley, 1990).

Within the last decade, the Iowa Chautauqua Program (ICP) emerged as an exemplary model of professional development of science teachers. The Chautauqua model is different from traditional forms of professional development. It supports teachers through an entire academic year and expects a commitment from teachers to practice in their classrooms the instructional innovations promoted by the program and to evaluate their effectiveness in the context of their own teaching situations. These evaluations become the focus of discussions during the academic year series of workshops. The discussions are intended to help refine and improve instructional approaches to better match one's teaching situation.

With endorsement from the U.S. Department of Education in 1993, the ICP was disseminated across the nation through the Department's National Diffusion Network (NDN). As a result, several new professional development programs emulating the Iowa Chautauqua model were initiated in diverse settings. Due to its higher expectations, successful implementation of the Chautauqua model depends upon a variety of factors, including concerns of those who would be involved in implementing the program. A study of the development and implementation of these new programs can yield needed and timely information about factors which influence the implementation of comprehensive professional development programs. To fulfill this need, a formative evaluation of the development and implementation of the Collier Chautauqua Program (CCP, Figure 1) was conducted in Collier County, Florida, during 1995-97.

Figure 1

The Collier Chautauqua Program

Summer Leadership Institute

A four-day institute for lead teachers designed to prepare them for leadership roles for the summer and academic year series of workshops.

Summer Training Institute

A three-week institute for participating teachers during which they experience new instructional strategies as students.

Academic Year Series of Workshops

Three-day workshops during Fall and Spring semesters to evaluate teaching trials of the modules designed during the summer institute, make further plans and refinements, and develop appropriate assessment schemes.

Interim Communication

Monthly meetings, electronic communication, and site-based meetings to share information, assess progress, and provide support and encouragement to peers continually.

Interim Teaching Projects

Teaching trials of modules developed during the summer institute and incorporation of new teaching strategies into the entire curriculum.

The CCP was implemented at the district level, involving elementary and middle school science teachers during the first two years of implementation. Participation in the program was voluntary. The CCP emulated the principal features of the ICP. The formative evaluation concentrated on major issues related to the implementation of the Chautauqua model in a large district setting, classroom implementation of instructional innovations by participating teachers, and teacher enhancement in multiple domains resulting from participation in the program. The results and specific findings of this evaluation have been reported elsewhere (Dass, 1997, 1998, 1999). This article focuses on the evaluation process itself. It is a reflective analysis of the effectiveness of the evaluation process in terms of the strengths and weaknesses of the approaches used and challenges faced in the implementation of evaluation activities. Based on these reflections, the article concludes with specific recommendations for improving the evaluation process.

Evaluation of Professional Development: A Timely Need

Factors contributing to effective inservice education began to be recognized as early as 1978 (Berman & McLaughlin, 1978) and new guidelines for effective inservice programs began to emerge in the 1980s (e.g., Evans, 1986). However, as Liu (1992) reported, a majority of the inservice programs did not take into consideration these factors and guidelines in the design of activities. The primary reason for this was the lack of emphasis on evaluation. Evaluation has been the most neglected part of education, particularly inservice education (Evans, 1986). In the 1980s, Kyle & Sedotti (1986, p. 101) observed, "The evaluation of staff development is not a well developed area. The paucity of research associated with staff development suggests that systematic evaluation of staff development is the exception rather than the rule." In the 1990s, Blunck (1993, p. 22) noted that evaluation of inservice is neither mentioned directly nor referred to indirectly in The Handbook of Research on Teacher Education.

The scarce evaluation activities that took place during the last four decades have been poorly conceived and narrowly defined. The earliest ones, during the 1950s and 1960s, were "objectives-based" and focused on "goal attainment" (Blunck, 1993). The ones during the 1970s focused mostly on teacher attitudes regarding inservice practices (Ainsworth, 1976; Brim & Tollett, 1974; Zigarmi, Betz, & Jensen, 1977); however, some began considering factors related to effective inservice programs (e.g., Berman & McLaughlin, 1978). Within the spirit of the "training" paradigm of inservice education, most of the evaluation activities were conducted as discrete events. They usually occurred after the completion of the activity and were backward-looking. They were rarely tied to change and improvement in professional practice or to planning better programs (Blunck, 1993; Guskey, 1995).

Along with recent changes in perspectives on staff development (shifting emphases from a view of ‘inservice education’ to a view of ‘professional development’, e.g., Renyi, 1996; National Research Council, 1996), perspectives on evaluation have also been changing. Evaluation began to be viewed more as a process rather than an event (Kyle & Sedotti, 1986; Verma, 1984). It is increasingly viewed as a way of improving as well as proving the worth of a program rather than merely providing information on the attainment of goals or objectives. The concept of formative evaluation evolved out of the notion that an important purpose of evaluation is to help improve the program on a continual basis rather than wait until the end to find out whether or not it was effective. Holly and Walley (1989) view evaluation as a "vehicle for professional development" (p. 290) and describe it as:

      ...an ongoing process that informs practice and contributes to 'the quality of provision' from multiple perspectives. Defining appropriate provisions, methods and scheduling for formative and summative evaluations, where there are opportunities to integrate and discuss self-evaluations and the evaluations of others, is the scaffolding for professional, staff and curriculum development (p. 293).
Traditionally, program evaluation has focused on teacher implementation of or compliance with specific goals or improvement in teacher knowledge and skills. Current thinking, however, considers evaluation as a vehicle to explore the effect of the program on individuals in terms of how they influence institutional culture and the interaction between the program and the institutional context within which the program is implemented (Fullan, 1993). Evaluations conducted along these lines will be far more powerful than the traditional ones, in providing information regarding improvement of a program specific to the particular context in which it is implemented. Consequently, the effectiveness of the program can be enhanced and desirable results achieved to a much greater extent.

Several recent actions attest to the changing perspectives regarding program evaluation and recognition of its value in contributing to program effectiveness for the desired reform. Some examples include publication of the National Science Foundation (NSF) monograph on evaluation (Frechtling, 1995), an increasing number of research studies on evaluation of professional development activities (Joyce & Showers, 1980; National Center for Improving Science Education, 1994; Sparks, 1983; Wood & Thompson, 1980), and a recent (May 1997) electronic discussion on "evaluation of teacher enhancement efforts" initiated by Teacher Enhancement Electronic Communications Hall (TEECH). As perspectives on professional development are changing and it is being viewed as critical in leading educational reform, the value of both formative and summative program evaluation is increasingly recognized. These evaluations are a critical component of professional development reform. Extensive evaluation schemes to document effective practices are a critical need of our time (Sparks & Loucks-Horsley, 1990).

Evaluation of the Collier Chautauqua Program: A Formative Approach

Conceptual Framework

The term evaluation has been used to mean a variety of things to different people in different contexts and, over time, several definitions of evaluation have emerged. However, a comprehensive definition was presented by the Joint Committee on Standards for Educational Evaluation (1981):

Systematic investigation of the worth or merit of an object...

According to Stevens (1993), this definition focuses on the function of evaluation for a purpose. One such purpose, which forms the basis for a formative evaluation, is to "improve" rather than "prove" the worth of a program (Stufflebeam et al., 1971). In describing the history of the development of evaluation studies and their function, Isaac and Michael (1981, p. 2) state:

      Evaluation, on the other hand, has come the way of technology rather than science. Its accent is not on theory building but on product delivery or mission accomplishment. Its essence is to provide feedback leading to a successful outcome defined in practical, concrete terms.
In the case of formative evaluation, the successful outcome can be defined as continual improvement and refinement of the program. The analogy used by evaluation theorist Robert Stake to distinguish between formative and summative evaluation is particularly useful in understanding this role of formative evaluation:
      When the cook tastes the soup, that’s formative; when the guests taste the soup, that’s summative.
Just as tasting by the cook is an integral part of the cooking process, evaluation activities need to be an integral part of program processes and results communicated regularly to stakeholders. The same view of formative evaluation is formally described by Stevens (1993):
      A formative evaluation assesses ongoing project activities with the intent to provide information to improve the project.
This concept of formative evaluation sets the foundation on which rests a four-component model of evaluation proposed by Asche and Hammons (1995). The model was originally developed by Stufflebeam. His four components were context, input, process, and product. Asche and Hammons modified the four components to inputs, processes, products, and outcomes. These components are defined as follows.
      INPUTS: The raw materials of the program such as facilities, supplies, tools, machines, faculty characteristics, etc.

      PROCESSES: Activities conducted to accomplish program goals. These may include teaching methods, curriculum, scheduling, etc.

      PRODUCTS: Immediate, measurable outcomes of program activities, for instance, teacher enhancement in specific domains.

      OUTCOMES: Long-term outcomes of program activities, such as development of leadership qualities in participating teachers, which can be documented over an extended period of time beyond the life of the program.

Since the Asche and Hammons model was being used by Collier County Public Schools in all their program evaluations at the time, the evaluation of the CCP also used this model as a conceptual framework. Working collaboratively, the author and Collier County School District administrators identified specific elements of each of the four components for CCP (Table 1) and developed an evaluation plan (Table 2). The intent of the evaluation was to generate information which could be used by Collier County for improving the implementation of the Iowa Chautauqua Model in the school district. The evaluation was conducted during the first two years of the CCP. The findings from the first year led to program modifications during the second year, and the second year evaluation influenced the direction of future program implementation.

Table 1

Elements of the CCP within the framework of the FOUR COMPONENT MODEL

Inputs:

Collier County teachers

A variety of local resource people including a professor from the University of South Florida

Resource personnel from the Iowa Chautauqua Program

Processes:

Leadership Training Week

Summer Training Workshop

Fall & Spring Follow-up Workshops

Interim teaching projects

Ongoing communication and feedback through monthly meetings, e-mail, and Chau-Talk Conference on FIRN

Teacher Products:

Development of leadership qualities

Growth in constructivist teaching practices

Change in attitude toward teaching

Increase in competency and confidence regarding science content

Increase in collaboration between teachers as school-based teams

Increase in the use of technology in instruction

Student Products:

Attitude toward learning in general and the study of science in particular

Understanding of science concepts

Understanding of the nature of science

Ability to work in collaborative teams

Teacher Outcomes:

Continued leadership and academic development activities beyond the duration of participation in the program

Increase in ability to integrate science with mathematics and language arts

Model for instructional philosophy

Student Outcomes:

Continued study of science subjects

College majors in fields of science

Science related career choices

Community Outcomes:

Increased collaboration between the community at large and the education community


Table 2

Management Plan for Formative Evaluation
Evaluation Questions
Data Elements


Standards of Success


 Data Sources, Methods and Responsibility


Time of Data Collection




To determine if the program served the population specified in the grant proposal

Did the program inputs get carried out as specified in the CCP grant proposal?

Inputs:

Collier County teachers of grades K-8

A variety of local resource people including a professor from the University of South Florida

 Resource personnel from the Iowa Chautauqua Program (ICP)

 Develop school-based teams of 5-10 teachers

Involve every local post-secondary institution and the major informal learning centers, research, and health centers

Bring ICP staff to workshops and maintain communication

Workshop observations, interviews of selected input personnel, and conferences with key organizers, by Internal Independent Auditor

Summer Training Workshop, Fall & Spring Follow-up Workshops, and interim communication

Summarize and share with Collier County School District Administration in the form of formal written reports

To determine if the methods employed by the program were consistent with those specified in the grant proposal

Did the processes get carried out as specified in the CCP grant proposal?

Processes:

Leadership Training Week

Summer Training Workshop

Fall & Spring Follow-up Workshops

Interim teaching projects

Ongoing communication and feedback through monthly meetings, e-mail, and Chau-Talk Conference on FIRN

At least 85% teachers attending training and follow-up workshops are members of school-based teams

At least 85% teachers attending workshops maintain communication and provide feedback through meetings, e-mail, and Chau-Talk Conference Group

Observations, and Interviews by Internal Independent Auditor.

Log keeping/journal activity by Collier County School District officials such as the Science Coordinators and Directors of Elementary & Secondary Education

Summer Training Workshop, Fall & Spring Follow-up Workshops, and interim meetings

Summarize and share with Collier County School District Administration in the form of formal written reports

To determine the extent to which teachers' behavior, attitude, and knowledge have been affected

What changes were brought about in teachers regarding:

Leadership, Use of Constructivist teaching strategies, Attitude about teaching, Content competence, Collaborative team work, and Use of Technology in instruction

Teacher Products:

Development of leadership qualities

Growth in the use of constructivist teaching practices

Improvement in attitude toward teaching

Increase in competency and confidence regarding science content

Increase in collaborative team work among colleagues in school-based teams

At least 30% teachers participate in planning new projects

At least 95% teachers implement workshop modules in their class

At least 85% teachers show increased positive attitude

At least 90% teachers show increased competency

At least 85% teachers demonstrate collaborative work as evidenced in the realization of the school improvement action plans

Individual interviews, focus-group interviews, observations, and administration of the Teacher Enhancement Assessment Instruments by the independent internal auditor

Journals and portfolios analyzed by district officials—Science Coordinator and Directors of Elementary and Secondary Education

Information regarding school improvement plans to be collected and analyzed by the district Science Coordinator

Summer training workshop, Fall & Spring Follow-up Workshops, and interim meetings

Inferential statistical analyses of questionnaires

Qualitative analysis of data from interviews, observations, journals, and portfolios

To determine the extent to which students' attitude and knowledge have been affected

What changes were brought about in students regarding:

Attitude, and Understanding of concepts

Student Products:

Attitude toward learning in general and toward science in particular

Understanding of science concepts

10% improvement in class attitude after the implementation of a single module

10% improvement over the previous year in overall achievement on performance objectives related to curriculum

Attitude assessment tests administered by classroom teachers and analyzed by CCP Assessment team

ESPET in grade 5, portfolios and journals in all other grades, and new survey instruments to be developed by the science coordinator and the assessment team

Attitude assessment test given on the first day of school and at the end of the teaching module

ESPET at the end of grade 5;

portfolios, journals, and other surveys toward the end of each academic year

Inferential statistical analyses of the attitude assessment tests and ESPET

Qualitative analyses of portfolios and journals


The questions addressed in this evaluation can be divided into two categories: Program implementation and teacher enhancement. The following specific questions were addressed.

Program Implementation

1. What factors contributed to successful implementation of the Chautauqua model on a district-wide basis? In what ways do these factors contribute to implementation success?

2. What factors hindered the implementation of the Chautauqua model on a district-wide basis? In what ways do these factors hinder implementation success?

3. What factors influenced teachers’ use of the constructivist and Science-Technology-Society (STS) approaches (promoted by the Iowa Chautauqua model) in their classrooms?

Teacher Enhancement

1. In what ways does participation in the Chautauqua Model of professional development enhance leadership qualities of teachers?

2. In what ways does participation in the Chautauqua Model of professional development enhance teachers' understanding and use of constructivist approaches?

3. In what ways does participation in the Chautauqua Model of professional development enhance teachers' attitudes toward teaching in general and toward teaching science in particular?

4. In what ways does participation in the Chautauqua Model of professional development enhance teachers' confidence to teach science?

5. In what ways does participation in the Chautauqua Model of professional development enhance collaboration among teachers, administrators, parents, and the community at large?

6. In what ways does participation in the Chautauqua Model of professional development enhance teachers' integration of technology in instruction?

Evaluation Design

A combination of quantitative and qualitative approaches was used in this evaluation. Data were gathered during the first two years of program implementation. Data collection involved the following.


Pre-post administration of a Teacher Enhancement Assessment Instrument (TEAI) developed and used extensively by the ICP. This instrument contains survey items with responses on a Likert-type scale. Most items are positive statements representing a desired state of affairs in various aspects of science instruction, and the responses range from NEVER to ALMOST ALWAYS. The TEAI was given to all participating teachers during both years. Pre-test administration took place within the first few days of the three-week summer workshop and post-test administration occurred during the spring workshop.


1. Interview of the program director at the very start of program development, summer 1994.

2. Individual interviews of lead teachers, program director, and director of middle and high school program services during the first summer training workshop, 1995.

3. Focus group interviews of participating teachers by school teams during the fall workshop, 1995.

4. Individual interviews of school principals, director of middle and high school program services, and assistant superintendent of curriculum and instruction at the end of the first year of implementation, summer 1996.

5. Observations by the author during summer, fall, and spring workshops, 1995-96 and 1996-97.

6. Classroom observations in the classes of a representative sample of teachers.

7. Pre-formatted teacher journals written during summer, fall, and spring workshops, 1995-96 and 1996-97.

8. Regular communication between the investigator and the program director via telephone and e-mail, 1995-97.

9. Written responses from the Iowa instructors (involved in CCP) to specific questions.

A summary of data collection events is provided in Table 3.

Table 3

Summary of Data Collection Events
Data collection events
June 1995 1. Individual interviews of district administrators and lead teachers.

2. Pre-test administration of the TEAI to first-year participants.

3. Workshop observations.

October 1995 1. Focus group interviews of first-year participants in school teams.

2. Workshop observations.

3. Collection of preformatted teacher journals.

March 1996 1. Post-test administration of the TEAI to first-year participants.

2. Workshop observations.

3. Collection of preformatted teacher journals.

June 1996 1. Individual interviews of building principals of first-year participants.

2. Pre-test administration of the TEAI to second-year participants.

3. Workshop observations.

October 1996 1. Written response from Iowa instructional staff.

2. Collection of preformatted teacher journals.

January 1997 1. Second post-test administration of the TEAI to first-year participants.
March 1997 1. Post-test administration of the TEAI to second-year participants.

2. Workshop observations.

3. Collection of preformatted teacher journals.

4. Final interview of the district science coordinator.


Data Analysis


Selected items from the TEAI were grouped to document professional development in the following six domains. Each domain corresponds respectively to questions 1-6 regarding teacher enhancement.

1. Leadership Qualities

2. Use of Constructivist Approaches

3. Attitudes toward Teaching

4. Teaching Confidence

5. Evidence of Collaboration

6. Integration of Technology

The items representing these domains are all positive statements and represent a desirable state of affairs. Therefore, responses to all items were scored according to the following scale: NEVER=0.00; RARELY=1.00; SOMETIMES=2.00; FREQUENTLY=3.00; ALMOST ALWAYS=4.00. Pre- and post-test scores were then analyzed in aggregated data sets, each set representing one of the domains of professional development identified above. The pre- and post-test scores from each group were computed in terms of percentages reporting given levels of confidence, understanding, or practice. Analysis of variance (ANOVA) with repeated measures was used to determine significant changes between pre-test and post-test scores, thus documenting teacher enhancement in specific domains. The difference between pre- and post-test scores was regarded as significant at P ≤ 0.10. This relatively high P value was used in order to decrease β (the probability of a Type II error), which in turn increases the 'power of the test' (Hinkle, Wiersma, & Jurs, 1994).
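The scoring and pre/post comparison described above can be illustrated with a minimal sketch. It assumes that each teacher's domain score is the mean of the item scores in that domain, which the article does not state explicitly; the function names and the sample data are hypothetical. With only two occasions (pre and post), the repeated-measures F statistic equals the square of the paired t statistic. Judging significance at P ≤ 0.10 would additionally require the F distribution, which is omitted here.

```python
from statistics import mean

# Scoring scale as stated in the text.
SCALE = {
    "NEVER": 0.0,
    "RARELY": 1.0,
    "SOMETIMES": 2.0,
    "FREQUENTLY": 3.0,
    "ALMOST ALWAYS": 4.0,
}

def domain_score(responses):
    """Mean item score for one teacher in one domain (assumed aggregation)."""
    return mean(SCALE[r] for r in responses)

def repeated_measures_f(pre, post):
    """F statistic for a one-way repeated-measures ANOVA with two occasions.

    pre and post are equal-length lists of domain scores, one pair per teacher.
    """
    n = len(pre)
    scores = pre + post
    grand = mean(scores)
    # Variation between the two occasions (pre vs. post).
    ss_time = n * ((mean(pre) - grand) ** 2 + (mean(post) - grand) ** 2)
    # Variation between teachers, based on each teacher's own mean.
    ss_subjects = 2 * sum(((a + b) / 2 - grand) ** 2 for a, b in zip(pre, post))
    ss_total = sum((x - grand) ** 2 for x in scores)
    ss_error = ss_total - ss_time - ss_subjects
    df_time, df_error = 1, n - 1
    return (ss_time / df_time) / (ss_error / df_error)

# Hypothetical data: three teachers' domain scores before and after.
pre_scores = [1.0, 2.0, 3.0]
post_scores = [2.0, 4.0, 3.0]
f_value = repeated_measures_f(pre_scores, post_scores)
```

The resulting F value would then be compared with the critical value for 1 and n - 1 degrees of freedom at the chosen significance level.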

Scores from year 1 and year 2 were analyzed separately, using the statistical procedures mentioned above, to document teacher enhancement during each of the two years. The results of year 1 and year 2 were then compared with each other to check similarities and differences in areas of enhancement between years 1 and 2. One-way factorial ANOVA using grade-level groups (PreK-2, 3-5, and 6-8) as the factor was conducted on Year 2 data to examine the effect of grade level on teacher enhancement. This was done in order to investigate whether or not program activities were equally effective in enhancing teachers at all grade levels represented by the participants. A similar analysis was not conducted for Year 1 data due to the relatively small N. A second post-test was given to year 1 participants in January 1997 to examine whether or not enhancement resulting from participation in the CCP lasts beyond the actual year of participation, thus reflecting the capability of the model to contribute to sustainable reform. However, the return rate on this second post-test was too low (only 9 out of 20, or 45%) to conduct any meaningful statistical analyses.
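As an illustration of the grade-level comparison, the following sketch computes a one-way ANOVA F statistic across three groups. It assumes the dependent variable is each teacher's gain score (post-test minus pre-test), which the article does not specify; the function name and the sample values are hypothetical and are not the study's data. The last line reproduces the return-rate arithmetic reported above.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA; groups is a list of lists of scores."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    k = len(groups)       # number of groups (here, grade-level bands)
    n = len(all_scores)   # total number of teachers
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical gain scores for the three grade-level groups.
gains = {
    "PreK-2": [1.0, 2.0, 3.0],
    "3-5": [2.0, 3.0, 4.0],
    "6-8": [3.0, 4.0, 5.0],
}
f_value = one_way_anova_f(list(gains.values()))

# Return rate on the second post-test, as reported in the text.
return_rate = 9 / 20  # 0.45, i.e. 45%
```

A large F value relative to the critical value for k - 1 and n - k degrees of freedom would suggest that enhancement differed by grade level.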


Qualitative data from all sources were analyzed using standard data coding procedures. These involve developing coding categories, sorting data at different levels using the coding categories, and building statements of findings based upon information from various categories (Bogdan & Biklen, 1992). Coding categories were developed on the basis of information available in the data and the areas which the investigator and program director wanted to pursue.

Preliminary analysis of all data was conducted during the period of data collection using the constant comparative method (Glaser & Strauss, 1967; Strauss, 1987) in order to refine data collection processes so that relevant information could be gathered. This involved examination of interview and observation data soon after they were collected to check the extent to which the information collected addressed the evaluation questions. This examination during the early part of data collection helped revise interview questions so that they were better suited to elicit information that was more useful in addressing the research questions.

Formal data analysis was conducted after the completion of data collection at the end of each of the two years. All qualitative data were initially subjected to an open-coding procedure. During this procedure, complete sentences and paragraphs of the transcripts were examined and compared for developing category codes. In accordance with the procedure suggested by Bogdan and Biklen (1992), the development of category codes was based primarily upon the information contained within the data and secondarily on some directions important to be pursued for understanding the subtleties of the school district in which the program was being implemented. The initial open-coding procedure yielded an extensive list of category codes. These were subsequently condensed into larger categories based on the commonality of themes within the initial codes. Propositions addressing the evaluation questions were ultimately derived from these larger categories.

The open coding procedure was particularly useful in developing substantive theory grounded in empirical data (Glaser & Strauss, 1967; Strauss & Corbin, 1990) regarding questions related to program implementation at the district level. Category codes thus generated were synthesized into propositions regarding factors related to implementation of the Chautauqua model of professional development in Collier County school district. For sorting data related to classroom implementation of the instructional strategies promoted by the program, the stages of concern identified by Hall, Wallace, and Dossett (1973) and Hall (1979) in the concern-based adoption model (CBAM) were used as category codes. The six teacher enhancement categories used in quantitative data analysis were used as category codes to sort data regarding teacher enhancement which could be used to triangulate the quantitative data. Table 4 summarizes the category development procedures using the domains of components of categorization and temporal designation suggested by Constas (1992).

Table 4

Summary of Category Development Procedures
Components of Categorization Temporal Designation
A priori
A posteriori
Origination (authority for category creation)      
Teacher Enhancement
Open Codes
Stages of Concern
Teacher Enhancement
Verification (grounds for justifying categories)      
Open Codes
Teacher Enhancement
Stages of Concern
Open Codes
Nomination (source of category names)      
Teacher Enhancement
Open Codes
Stages of Concern


The Evaluation Process: A Reflective Analysis

For specific findings of this evaluation, readers are referred to Dass (1997, 1998, 1999). The purpose of this article is to focus on the evaluation process itself. Due to the formative nature of this evaluation, there was regular communication between the author and the program director. The results of the first-year data analysis were used to inform and guide the planning for second-year implementation. The results of the second-year data analysis were used to evaluate the effectiveness of changes made during the second year and to inform future implementation of the program. The author's reflective analysis of the evaluation process focuses on the rationale for the use of specific approaches, the effectiveness of the approaches and instruments used, and the major challenges encountered in conducting the evaluation activities.

Quantitative or Qualitative?

This evaluation employed both quantitative and qualitative methodologies to collect and analyze data. As a formative evaluation, it focused on program implementation issues and teacher enhancements resulting from participation in the program. The quantitative methodology produced information regarding teacher enhancements only, whereas qualitative methodologies produced information regarding both teacher enhancements and program implementation.

The role of both quantitative and qualitative approaches in educational research, particularly science education research, has been discussed extensively in the recent past (Good, 1992; Kyle, Abell, & Roth, 1992; Lederman, 1992; Wandersee & Demastes, 1992; Yarger & Smith, 1990). The general consensus is that, regardless of the approach used (whether quantitative, qualitative, or both), the quality of any investigation is enhanced by the use of appropriate, high-quality warrants (Roberts, 1982) and sound argumentative structure, based upon those warrants, leading to specific conclusions (Roberts, 1996). The decision regarding the type of approach to be used must be directed by the research questions the investigation is intended to address (Lederman, 1992).

Using both approaches in this investigation ensured that relevant information was gathered and analyzed to address questions related to both program implementation and teacher enhancement. Further, the use of both approaches to gather information regarding teacher enhancement provided between-methods triangulation (Denzin, 1978), which increased the credibility of conclusions about teacher enhancement. Similarly, the use of multiple qualitative approaches (interviews, observations, journals, etc.) ensured within-method triangulation, and data collection at multiple points in time and from multiple sources ensured data triangulation. The nature of the questions related to program implementation did not lend itself to quantitative approaches. Hence, program implementation factors were examined only through qualitative approaches.

Data Collection: Approaches and Instruments

Much of the data collected during this evaluation, both quantitative and qualitative, came from teacher self-reports. These included responses on a 74-item survey instrument with a 5-point Likert-type response scale, in-depth interviews, focus groups, and pre-formatted journals. There has been much discussion of the difficulties related to levels of accuracy, validity, and reliability of self-reported data (Chall & Feldmann, 1966; Denzin, 1978; Hook & Rosenshine, 1979; Koziol & Burns, 1985, 1986; Newfield, 1980; Schmidt, 1996; Traub & Weiss, 1982).

Common concerns raised in these discussions of self-reported data include: 1) the effect of the instruments on the setting they were meant to describe; 2) impressions created by self-reports that may not be significant in the minds of the respondents; 3) the effect of individual differences (irrelevant to the explicit topic) on the data obtained; and 4) lack of validity and reliability. However, several researchers have also found high levels of accuracy in self-reported data (Koziol & Burns, 1986; Newfield, 1980; Traub & Weiss, 1982). Koziol and Burns report that focused teacher self-reports can gather reliable data on instructional practices. Newfield found that, under certain conditions, teachers can accurately report their own behaviors. Traub and Weiss contend that teacher self-reported data may be more accurate than is typically believed.

In order to minimize the limitations of self-reported data, researchers recommend the use of triangulation approaches. Denzin (1978) has identified four types of triangulation approaches: 1) Data triangulation (collecting data from multiple sources at multiple points in time); 2) Methodological triangulation (using multiple methods of data collection); 3) Investigator triangulation (using multiple investigators to collect data); and 4) Theory triangulation (use of specific, preferably multiple, theoretical perspectives).

This evaluation used all but the theory triangulation approach, to varying degrees, in order to minimize the limitations of self-reported data. Moreover, a limited amount of first-hand data was also collected through classroom visits (approximately one per teacher) by the author and the Collier County District Science Coordinator. Although limited in amount and scope, these data helped in checking the accuracy of some of the self-reported data provided by teachers and offered a degree of investigator triangulation.

Effectiveness of the TEAI

The effectiveness of the TEAI as an instrument to collect quantitative data was limited by several factors. The first of these was the inherent nature of self-reported data. This limitation was overcome in part by triangulating quantitative results with qualitative ones. However, this triangulation cannot be regarded as perfect, since much of the qualitative data were also of the self-reported kind (interviews and journals).

Second, the Likert-type response scale used in the TEAI imposes limitations in the sense that individuals interpret response categories differently. The response categories used in the TEAI are NEVER, RARELY, SOMETIMES, FREQUENTLY, and ALMOST ALWAYS. These categories were not defined in the instrument. Hence, participants were free to interpret these categories as they wished.
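One way to picture what explicit category definitions might look like is the minimal Python sketch below. It is illustrative only: the TEAI did not define its categories, so the numeric codes 1 to 5 and the written definitions are hypothetical examples and are not part of the instrument.

```python
from enum import IntEnum


class Frequency(IntEnum):
    """The five TEAI response labels; the 1-5 coding is an assumption."""
    NEVER = 1
    RARELY = 2
    SOMETIMES = 3
    FREQUENTLY = 4
    ALMOST_ALWAYS = 5


# Hypothetical definitions that an instrument could print next to each label
# so that all respondents read the categories the same way.
DEFINITIONS = {
    Frequency.NEVER: "not at all during the school year",
    Frequency.RARELY: "a few times during the school year",
    Frequency.SOMETIMES: "about once a month",
    Frequency.FREQUENTLY: "about once a week",
    Frequency.ALMOST_ALWAYS: "in nearly every lesson",
}


def encode(label: str) -> Frequency:
    """Convert a response label such as 'Almost Always' to its category."""
    return Frequency[label.strip().upper().replace(" ", "_")]


if __name__ == "__main__":
    for category in Frequency:
        print(int(category), category.name, "-", DEFINITIONS[category])
```

Attaching a definition to every label in this way is one possible route to reducing differences in interpretation among respondents; the wording of the definitions would need to suit the behaviors an instrument actually asks about.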

Third, the statistical value of the use of the TEAI was limited by attrition between the pre- and post-tests, particularly in the second year, and by non-response on several items by individuals either on the pre-test or the post-test. For any given individual, the statistical software (SYSTAT) used to analyze TEAI data processed only those items which had a response on both pre- and post-test. Thus, the N varied (by no more than 4) on different items of the TEAI.
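The item-by-item handling of missing responses described above can be illustrated with a minimal Python sketch. It is not the SYSTAT procedure used in the evaluation; the respondent identifiers, item names, and scores are hypothetical, and the 1 to 5 numeric coding is an assumption. The sketch shows only why N can differ from item to item when a respondent counts toward an item only if both the pre-test and the post-test response are present.

```python
from typing import Dict, List, Optional, Tuple

# respondent -> item -> score (1-5); None marks an item left blank.
Responses = Dict[str, Dict[str, Optional[int]]]

# Hypothetical data, invented purely for illustration.
pre: Responses = {
    "T01": {"item1": 3, "item2": 2},
    "T02": {"item1": 4, "item2": None},  # skipped item2 on the pre-test
    "T03": {"item1": 2, "item2": 3},
}
post: Responses = {
    "T01": {"item1": 4, "item2": 4},
    "T02": {"item1": 5, "item2": 3},
    # T03 did not complete the post-test (attrition)
}


def paired_scores(pre: Responses, post: Responses, item: str) -> List[Tuple[int, int]]:
    """Return (pre, post) pairs for respondents who answered the item on both tests."""
    pairs = []
    for teacher, pre_items in pre.items():
        before = pre_items.get(item)
        after = post.get(teacher, {}).get(item)
        if before is not None and after is not None:
            pairs.append((before, after))
    return pairs


def item_summary(pre: Responses, post: Responses, item: str) -> Tuple[int, Optional[float]]:
    """Return (N, mean post-minus-pre change) for one item."""
    pairs = paired_scores(pre, post, item)
    n = len(pairs)
    mean_change = sum(after - before for before, after in pairs) / n if n else None
    return n, mean_change


if __name__ == "__main__":
    for item in ("item1", "item2"):
        n, change = item_summary(pre, post, item)
        print(item, "N =", n, "mean change =", change)
```

In this invented data set, one respondent drops out of every item through attrition and another drops out of a single item through non-response, so the two items end up with different values of N.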

Effectiveness of the Qualitative Methods

The effectiveness of the qualitative methods was limited by at least two factors. Here too, the first factor is the self-reported nature of the data. Much of the qualitative data were gathered through interviews (either individual or focus group) and teacher journals. Both of these yield self-reported data. While classroom observations were conducted to corroborate interview and journal data, the number of classroom visits was too small (approximately one per teacher) to corroborate all the information that emerged from the interview and journal data. The classroom visits were intended to gather evidence of teacher understanding and use of various instructional approaches touted by the program and addressed by teachers in their journals and interviews. Observing one lesson, however, is clearly not enough to provide sufficient evidence. Further, since the visits were pre-planned, it is also possible that teachers "put on a show" for that particular lesson.

Second, there was a certain degree of hesitancy among some teachers to respond, on record, to interview questions in as much detail as would have been useful from a formative evaluation point of view. For instance, during one of the focus groups, a few of the teachers requested that the tape recorder be shut off and their comments be kept off the record. Similar hesitancy was also evident in the brevity of comments made by some teachers in response to the journal questions.

Challenges in Conducting Evaluation Activities

Conducting a formative evaluation of a professional development program like the CCP is challenging because program activities take place both during the summer and throughout the academic year. The evaluator must collect data at strategic points in time and communicate findings to appropriate people in a timely manner. The evaluator must cultivate a collegial relationship with participants in order to elicit useful information from them through qualitative techniques, such as the interviews. The following are the major challenges encountered during this evaluation.

1. As a person from ‘out-of-town’, I was not able to observe classrooms of participating teachers on a frequent basis. I observed only five classes and a few more were observed by the program director (District Science Coordinator). Many more classroom observations were needed in order to gather data sufficient for corroborating the information emerging from teacher self-reported data, both quantitative and qualitative.

2. Scheduling time for evaluation activities during the workshops was not always easy, particularly during the Fall and Spring workshops which were of shorter duration than the Summer workshops. Scheduling interviews and focus groups was most challenging and they often ended up occurring concurrently with some other activity, which meant that participants being interviewed had to miss specific activities.

3. Getting teachers to complete written evaluation tasks (such as the TEAI and journals) in a timely manner was a challenge. Often some teachers would not complete the task during the scheduled time and would need to be reminded several times during the workshop to complete and submit the task.

4. Teachers tended to be somewhat apprehensive about evaluation activities. Many felt as if the evaluation tasks were assessing their performance (as opposed to assessing the program’s performance). Consequently, they would either hesitate to provide the information readily or would try to guess the "right answer" (what does he, the evaluator, want to hear?).


Evaluation must be tied to change and improvement in professional practice or to planning better programs if reform is to take place (Guskey, 1995). A consideration of the evaluation process is valuable for those who are or would be involved in developing and implementing professional development programs similar to the one reported here. An analysis and discussion of the evaluation process meets a critical need within the area of professional development. As evaluation processes improve, the programs they inform can also be expected to improve. To that end, the following recommendations emerge from this analysis of the formative evaluation of the CCP. These recommendations can be useful in improving future evaluations of similar programs.

1. Ensure a balanced collection of both teacher self-reported data (such as questionnaires and journals) and evaluator first-hand data (such as classroom observations). This will increase the efficacy of triangulation, which in turn will enhance the validity of claims.

2. When using a Likert-type response scale, provide specific definition for each response category in order to eliminate ambiguity about their interpretation.

3. Emphasize to the respondents, through detailed instructions, the importance of providing a response to every item on any quantitative instruments used. Design the items and response options in such a way that a response is possible by all participants. This will reduce the problems related to statistical significance.

4. Make evaluation an integral part of the program activities and schedule evaluation activities in such a way that they flow naturally with other activities and do not appear to be special events. Also, evaluation activities should not compete with other program activities for participants during the same time slots.

5. Right from the beginning of the program, help teachers understand that "they" are not being "graded". Rather, it is the effectiveness of the program that is being assessed. Emphasize to them that the more honest, candid, and detailed information they provide, the more useful the evaluation will be in improving the program. Consider teachers' view of assessment in designing and refining program evaluation. Eventually, they will benefit more if the evaluation can contribute to program improvement.

When conducted with efficient planning and insights regarding the specific context of the program and the setting, formative evaluation can be a powerful tool for program improvement, consequently leading to reform as envisioned by Guskey (1995).

References


Ainsworth, A. (1976). Teachers talk about inservice education. Journal of Teacher Education, 27(2), 107-109.

Asche, M. & Hammons, F. (1995). Internal evaluation: Planning guide. Presented at the 2nd Annual Florida Statewide Technical Preparation Evaluation Workshop, Orlando, Florida. June 1995.

Berman, P. & McLaughlin, M.W. (1978). Federal programs supporting educational change VIII: Implementing and sustaining innovations. Santa Monica, CA: Rand Corporation.

Blunck, S.M. (1993). Evaluating the effectiveness of the Iowa Chautauqua Inservice Program: Changing the reculturing practices of teachers. Unpublished doctoral dissertation, The University of Iowa, Iowa City.

Bogdan, R.C., & Biklen, S.K. (1992). Qualitative research for education: An introduction to theory and methods (2nd ed.). Boston: Allyn and Bacon.

Brim, J. & Tollett, D. (1974). How do teachers feel about inservice education? Educational Leadership, 31(6), 521-525.

Chall, J. & Feldmann, S. (1966). First grade reading: An analysis of the interaction of professed methods, teacher implementation, and child background. Reading Teacher, 19, 569-575.

Constas, M.A. (1992). Qualitative analysis as a public event: The documentation of category development procedures. American Educational Research Journal, 29(2), 253-266.

Darling-Hammond, L. & McLaughlin, M.W. (1995). Policies that support professional development in an era of reform. Phi Delta Kappan, 76(8), 597-604.

Dass, P.M. (1997). District-wide professional development of science teachers: Factors influencing the implementation of the Chautauqua Model. Presented at the National Association for Research in Science Teaching Annual Meeting, Oak Brook, Illinois. March 1997.

Dass, P.M. (1998). Professional development of science teachers: Results of using the Iowa Chautauqua Model in Collier County, Florida. Presented at the National Association for Research in Science Teaching Annual Meeting, San Diego, California. April 1998.

Dass, P.M. (1999). Implementation of instructional innovations: Perspectives of inservice teachers. Presented at the Association of Teacher Educators Annual Conference, Chicago, Illinois. February 1999.

Denzin, N.K. (1978). The research act: A theoretical introduction to sociological methods. New York: McGraw-Hill.

Evans, T.P. (1986). Guidelines for effective science teacher inservice education programs: Perspectives from research. In B. Spector (Ed.), A guide to inservice science teacher education: Research into practice: AETS yearbook 1986, pp. 13-55. Columbus, OH: ERIC Clearinghouse for Science, Mathematics, and Environmental Education.

Frechtling, J.A. (Ed.). (1995). Footprints: Strategies for non-traditional program evaluation. Rockville, MD: Westat, Inc.

Fullan, M.G. (1993). Change forces: Probing the depths of educational reform. London, New York: Falmer Press.

Fullan, M.G. (1995). The limits and the potential of professional development. In T.R. Guskey & M. Huberman (Eds.), Professional development in education: New paradigms and practices, pp. 253-267. New York: Teachers College Press.

Glaser, B. & Strauss, A.L. (1967). The discovery of grounded theory: Strategies for qualitative research. Chicago: Aldine.

Good, R. (1992). JRST welcomes all quality research. Journal of Research in Science Teaching, 29(9), 913.

Guskey, T.R. (1995). Professional development in education: In search of the optimal mix. In T.R. Guskey & M. Huberman (Eds.), Professional development in education: New paradigms and practices, pp. 114-131. New York, NY: Teachers College Press.

Hall, G.E. (1979). The concerns based approach to facilitating change. Educational Horizons, 57(4), 202-208.

Hall, G.E., Wallace, R.C., Jr., & Dossett, W.A. (1973). A developmental conceptualization of the adoption process within educational institutions. Austin, Texas: Research and Development Center for Teacher Education, The University of Texas.

Hinkle, D.E., Wiersma, W., & Jurs, S.G. (1994). Applied statistics for the behavioral sciences, 3rd Edition. Boston, Toronto: Houghton Mifflin Company.

Holly, M.L. & Walley, C. (1989). Teachers as professionals. In M.L. Holly & C.S. Mcloughlin (Eds.), Perspectives on teacher professional development, pp. 285-307. Philadelphia, PA: The Falmer Press.

Hook, C.M. & Rosenshine, B.V. (1979). Accuracy of teacher reports of their classroom behavior. Review of Educational Research, 49, 1-12.

Isaac, S. & Michael, W.B. (1981). Handbook in Research and Evaluation for Education and the Behavioral Sciences, 2nd edition. San Diego, CA: EdITS.

Joint Committee on Standards for Educational Evaluation. (1981). Standards for evaluation of educational programs, projects, and materials. New York, NY: McGraw Hill.

Joyce, B. & Showers, B. (1980). Improving inservice training: The messages of research. Educational Leadership, 37(5), 379-385.

Koziol, S.M. & Burns, P. (1985). Using teacher self-reports for monitoring English instruction. English Education, 17(2), 113-120.

Koziol, S.M. & Burns, P. (1986). Teachers' accuracy in self-reporting about instructional practices using a focused self-report inventory. Journal of Educational Research, 79(4), 205-209.

Kyle, W.C., Jr. (1995). Professional development: The growth and learning of teachers as professionals over time. Journal of Research in Science Teaching, 32(7), 679-681.

Kyle, W.C., Jr., Abell, S.K., & Roth, W.M. (1992). Toward a mature discipline of science education. Journal of Research in Science Teaching, 29(9), 1015-1018.

Kyle, W.C., Jr. & Sedotti, M.A. (1986). The evaluation of staff development: A process, not an event. In B. Spector (Ed.), A guide to inservice science teacher education: Research into practice: AETS yearbook 1986, pp. 101-118. Columbus, OH: ERIC Clearinghouse for Science, Mathematics, and Environmental Education.

Lederman, N.G. (1992). You can't do it by arithmetic, you have to do it by algebra! Journal of Research in Science Teaching, 29(9), 1011-1014.

Lieberman, A. (1995). Practices that support teacher development: Transforming conceptions of professional learning. Phi Delta Kappan, 76(8), 591-596.

Lieberman, A. & Miller, L. (1992). The professional development of teachers. In Myron Atkin (Ed.), The encyclopedia of educational research, 6th edition, vol. 3, pp. 1045-1053. New York: Macmillan Company.

Little, J.W. (1993). Teachers’ professional development in a climate of educational reform. New York: National Center for Restructuring Education, Schools, and Teaching, Teachers College, Columbia University.

Liu, C. (1992). Evaluating the effectiveness of an inservice teacher education program: The Iowa Chautauqua Program. Unpublished doctoral dissertation, The University of Iowa, Iowa City.

Miles, M.B. (1995). Foreword. In T. R. Guskey & M. Huberman (Eds.), Professional development in education: New paradigms and practices, pp. vii-ix. New York: Teachers College Press.

National Center for Improving Science Education. (1994). Evaluation Report: FCCSET/DOE 1993 Summer Institutes. Andover, MA: The Network.

National Research Council. (1996). National science education standards. Washington, DC: National Academy Press.

Newfield, J. (1980). Accuracy of teacher reports: Reports and observations of specific classroom behaviors. Journal of Educational Research, 74(2), 78-82.

Renyi, J. (Ed.). (1996). Teachers take charge of their learning: Transforming professional development for student success. Washington, DC: National Foundation for the Improvement of Education.

Roberts, D.A. (1982). The place of qualitative research in science education. Journal of Research in Science Teaching, 19(4), 277-292.

Roberts, D.A. (1996). What counts as quality in qualitative research? Science Education, 80(3), 243-248.

Schmidt, A.E. (1996). Examining consistency of responses in self-reports using content analysis. Paper presented at the American Educational Research Association Annual Meeting, New York, NY. April 1996.

Sparks, D. & Loucks-Horsley, S. (1990). Models of staff development. In W.R. Houston, M. Haberman, & J. Sikula (Eds.), Handbook of research on teacher education, pp. 234-250. New York, NY: Macmillan Publishing Company.

Sparks, G.M. (1983). Synthesis of research on staff development for effective teaching. Educational Leadership, 41(3), 65-72.

Stevens, F. (1993). Evaluation prototypes. In J. Frechtling (Ed.), User-friendly handbook for project evaluation: Science, mathematics, engineering and technology education, pp. 1-13. Arlington, VA: National Science Foundation.

Strauss, A. (1987). Qualitative analysis for social scientists. New York, NY: Cambridge University Press.

Strauss, A. & Corbin, J. (1990). Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: Sage.

Stufflebeam, D.L., Foley, W.J., Gephart, W.J., Guba, E.G., Hammond, R.L., Merriman, H.O. & Provus, M.M. (1971). Educational Evaluation and Decision Making. Itasca, IL: F. E. Peacock.

Sykes, G. (1996). Reform of and as professional development. Phi Delta Kappan, 77(7), 465-467.

Traub, R.E. & Weiss, J. (1982). The accuracy of teachers' self-reports: Evidence from an observational study of open education. ERIC Document Reproduction Services, No. ED 213 745.

Verma, S. (1984). Staff development: A systematic approach. Education Canada, 25(3), 9-13.

Wandersee, J.H. & Demastes, S. (1992). An analysis of the relative success of qualitative and quantitative manuscripts submitted to the Journal of Research in Science Teaching. Journal of Research in Science Teaching, 29(9), 1005-1010.

Wood, F.H. & Thompson, S.R. (1980). Guidelines for better staff development. Educational Leadership, 37(5), 374-378.

Yarger, S.J. & Smith, P.L. (1990). Issues in research on teacher education. In W. R. Houston, M. Haberman, & J. Sikula (Eds.), Handbook of research on teacher education, pp. 25-41. New York: Macmillan Publishing Company.

Zigarmi, P., Betz, L., & Jensen, D. (1977). Teachers preferences and perceptions of inservice education. Educational Leadership, 34(7), 545-551.

About the author...

P. M. Dass is currently an assistant professor of science education at Northeastern Illinois University, Chicago. He holds a Ph.D. in Science Education from the University of Iowa. His present responsibilities include teaching pre-service secondary science methods courses, supervising science clinical experiences and student teaching, and coordinating the secondary education program. Prior to coming to Northeastern Illinois University, the author coordinated the National Diffusion Network project of the Iowa Chautauqua Program at the University of Iowa. As coordinator of this project, the author was responsible for nationwide dissemination of the Iowa Chautauqua model of professional development for in-service science teachers. The Collier Chautauqua Program highlighted in this article is one of several professional development programs that grew out of an emulation of the Iowa Chautauqua model through the author's dissemination work.