
Unit tests check individual units in isolation, and end-to-end (E2E) tests check the whole application. But between these two stages of testing, there are others. I, like many others, call such tests integration tests.

A few words about terminology

Having talked a lot with test-driven development enthusiasts, I came to the conclusion that they have a different definition for the term "integration tests". From their point of view, an integration test checks the "external" code, that is, code that interacts with the "outside world" - the world outside the application.

So if their code uses Ajax, localStorage, or IndexedDB and therefore can't be tested with unit tests, they wrap that functionality in an interface and mock that interface for unit tests, and they call the test of the actual implementation of the interface an "integration test". From this point of view, an "integration test" simply tests code that interacts with the "real world", as opposed to the units that work without regard to the real world.

I, like many others, tend to use the term "integration tests" for tests that check the integration of two or more units (modules, classes, etc.). It doesn't matter whether those units hide the real world behind mocked interfaces.

My rule of thumb about whether to use real implementations of Ajax and other input/output (I/O) operations in integration tests is this: if you can do it and the tests still run fast and don't behave strangely, then test the real I/O. If the I/O is complex, slow, or just plain weird, then use mock objects in your integration tests.

In our calculator, fortunately, the only real I/O is the DOM. No Ajax calls and no other reason to write "mocks".

Fake DOM

The question arises: do we need to fake the DOM in integration tests? Let's apply my rule. Will using the real DOM make the tests slow? Unfortunately, the answer is "yes": using the real DOM means using a real browser, which makes the tests slow and unpredictable.

So do we separate most of the code from the DOM, or do we test everything together in E2E tests? Neither option is optimal. Fortunately, there is a third solution: jsdom. This wonderful and amazing package does exactly what you expect from it - it implements the DOM in NodeJS.

It works, it's fast, it runs in Node. If you use this tool, you can stop treating the DOM as "I/O". And this is very important, because separating the DOM from the front-end code is difficult, if not impossible. (For example, I don't know how to do this.) My guess is that jsdom was written specifically to run front-end tests under Node.
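
To give a feel for what jsdom provides, here is a minimal sketch. It assumes a recent jsdom version that exposes the JSDOM class; the test code later in this article uses an older, function-style jsdom API, but the idea is the same.

const { JSDOM } = require("jsdom")

// Build a document from an HTML string - no browser involved.
const dom = new JSDOM(`<body><div id="container"></div></body>`)
const document = dom.window.document

// The usual DOM API is now available under Node.
document.getElementById("container").textContent = "hello"
console.log(document.body.innerHTML) // <div id="container">hello</div>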

Let's see how it works. As usual, there is initialization code and there is test code, but this time we will start with test code. But before that, a digression.

A digression

This part is the only part of the series that focuses on a specific framework. And the framework I chose is React. Not because it's the best framework - I firmly believe that there is no such thing. I don't even think there are better frameworks for specific use cases. The only thing I believe in is that people should use the framework they are most comfortable working in.

And the framework I'm most comfortable with is React, so the following code is written in it. But as we will see, front-end integration tests using jsdom should work in all modern frameworks.

Let's get back to using jsdom.

Using jsdom

const React = require("react")
const e = React.createElement
const ReactDom = require("react-dom")
const CalculatorApp = require("../../lib/calculator-app")
...
describe("calculator app component", function () {
  ...
  it("should work", function () {
    ReactDom.render(e(CalculatorApp), document.getElementById("container"))

    const displayElement = document.querySelector(".display")

    expect(displayElement.textContent).to.equal("0")

The interesting part is the body of the test. The ReactDom.render call renders the CalculatorApp component, which (if you follow the code in the repository) also renders the Display and Keypad components.

We then check, with querySelector and the expect assertion, that the display element in the DOM shows the calculator's initial value of 0.

And this code, which runs under Node, uses document! The document global variable is a browser variable, yet here it is in NodeJS. A very large amount of code is required for these lines to work, and that very large amount of code is jsdom: in effect, a complete implementation of everything that is in the browser, minus the rendering itself!

The ReactDom.render call that renders the component also uses document (and window), since ReactDom relies on them in its code.

So who creates these global variables? The test does - let's look at the code:

before(function () {
  global.document = jsdom(`<div id="container"></div>`)
  global.window = document.defaultView
})

after(function () {
  delete global.window
  delete global.document
})

In the before hook, we create a simple document that contains only a div.

We also create the global window object, because React needs it.

The cleanup code in after removes these globals so they don't leak memory.

Ideally, the document and window variables should not be global. Otherwise, we won't be able to run tests in parallel with other integration tests because they will all overwrite global variables.

Unfortunately, they have to be global: React and ReactDom need document and window to be exactly that, since there is no way to pass them in explicitly.

Event Handling

What about the rest of the test? Let's take a look:

ReactDom.render(e(CalculatorApp), document.getElementById("container"))

const displayElement = document.querySelector(".display")
expect(displayElement.textContent).to.equal("0")

const digit4Element = document.querySelector(".digit-4")
const digit2Element = document.querySelector(".digit-2")
const operatorMultiply = document.querySelector(".operator-multiply")
const operatorEquals = document.querySelector(".operator-equals")

digit4Element.click()
digit2Element.click()
operatorMultiply.click()
digit2Element.click()
operatorEquals.click()

expect(displayElement.textContent).to.equal("84")

The rest of the test tests a scenario where the user presses "42*2=" and should get "84".

And it does it in a nice way - it gets the elements using the famous querySelector function and then uses click to click on them. You can even create an event and fire it manually using something like:

var ev = new Event("keyup", ...);
document.dispatchEvent(ev);
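
For instance, a click on one of the digit elements could also be dispatched manually along these lines (a sketch; it relies on the window global set up in the before hook shown earlier):

const clickEvent = new window.MouseEvent("click", { bubbles: true })
digit4Element.dispatchEvent(clickEvent)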

But the built-in click method works, so we use it.

So simple!

The astute reader will note that this test checks exactly the same thing as the E2E test. That is true, but note that this test is about ten times faster and is synchronous in nature. It is much easier to write and much easier to read.

And if the tests check the same thing, why do you need the integration one? Well, simply because this is a school project, not a real one. The two components make up the entire application, so the integration and E2E tests do the same thing. But in a real application, an E2E test covers hundreds of modules, while an integration test covers a few, maybe ten. Thus, a real application will have about ten E2E tests but hundreds of integration tests.

Article 3 talked about traditional tests; definitions of homogeneous and heterogeneous tests were also given there. Today's article covers non-traditional tests, which include integrative, adaptive, and multi-stage tests, as well as the so-called tests with a criterion-oriented interpretation of the results.

1. Integrative tests
An integrative test is a test consisting of a system of tasks that meet the requirements of integrative content, test form, and increasing task difficulty, aimed at a generalized final diagnosis of the preparedness of a graduate of an educational institution. The diagnosis is carried out by presenting tasks whose correct answers require integrated (generalized, clearly interconnected) knowledge of two or more academic disciplines. Creating such tests is within the reach of only those teachers who know a number of academic disciplines, understand the important role of interdisciplinary connections in learning, and are able to create tasks whose correct answers require students to have knowledge of various disciplines and the ability to apply that knowledge.

Integrative testing is preceded by the organization of integrative learning. Unfortunately, the current class-lesson form of instruction, combined with excessive fragmentation of academic disciplines and with the tradition of teaching individual disciplines (rather than generalized courses), will hamper the introduction of an integrative approach into learning and into the control of preparedness for a long time to come. The advantage of integrative tests over heterogeneous ones lies in the greater informative content of each task and in the smaller number of tasks. The need for integrative tests grows as the level of education and the number of disciplines studied increase. Therefore, attempts to create such tests are found mainly in higher education. Integrative tests are particularly useful for improving the objectivity and efficiency of the final state certification of pupils and students.

The methodology for creating integrative tests is similar to the methodology for creating traditional tests, except for the work on determining the content of the tasks. For selecting the content of integrative tests, the use of expert methods is mandatory, because only experts can determine whether the content of the tasks is adequate to the goals of the test. But first of all, the experts themselves need to decide on the goals of education and of studying particular educational programs, and then agree among themselves on the fundamental issues, leaving for examination only the variations in understanding the significance of individual elements in the overall structure of preparedness. A group of experts that has agreed on the fundamental issues is often called a panel in the foreign literature; or, given the differences in the meaning of that last word in Russian, such a group can be called a representative expert group. The group is selected so as to adequately represent the approach used in creating the respective test.

2. Adaptive tests
The expediency of adaptive control follows from the need to rationalize traditional testing. Every teacher understands that there is no point in giving a well-prepared student easy and very easy tasks, because the probability of a correct answer is too high; besides, easy material has no noticeable developmental potential. Symmetrically, because of the high probability of a wrong answer, it makes no sense to give difficult tasks to a weak student: it is known that difficult and very difficult tasks reduce the learning motivation of many students. What was needed was a measure that puts the difficulty of tasks and the level of knowledge on one comparable scale. Such a measure was found in the theory of pedagogical measurements; the Danish mathematician G. Rasch called it the "logit" (1). After the advent of computers, this measure formed the basis of the method of adaptive knowledge control, which regulates the difficulty and the number of tasks presented depending on the students' responses. If the answer is correct, the computer selects a more difficult next task; if it is wrong, an easier one. Naturally, this algorithm requires all tasks to be tried out beforehand, their difficulty to be determined, and a bank of tasks and a special program to be created.
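
Note (1) in the literature list below spells this measure out verbally; written out as formulas (a sketch, with p standing for the subject's proportion of correct answers over all test tasks and q for the proportion of correct answers a given task receives from the set of subjects):

\theta = \ln\frac{p}{1-p}, \qquad \delta = \ln\frac{1-q}{q}

Here \theta is the knowledge level and \delta is the task difficulty; both are expressed in logits, which is what puts the ability of the subject and the difficulty of the task on a single comparable scale.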

The use of tasks corresponding to the level of preparedness significantly increases the accuracy of measurements and minimizes the time of individual testing to about 5 - 10 minutes.

In the Western literature, three variants of adaptive testing are distinguished. The first is called pyramid testing: in the absence of preliminary assessments, all subjects are given a task of medium difficulty, and only then, depending on the answer, each subject is given an easier or a harder task; at each step it is useful to apply the rule of dividing the difficulty scale in half. In the second variant, control begins at whatever difficulty level the test subject wishes, with a gradual approach to the real level of knowledge. In the third variant, testing is carried out by means of a bank of tasks divided into difficulty levels.
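
As an illustration of the first, pyramid variant, here is a hedged sketch; askTask is a hypothetical function that presents a task of the given difficulty and returns whether the answer was correct - nothing here comes from a real testing system.

function pyramidTest(askTask, rounds) {
  let difficulty = 0.5   // start in the middle of a normalized 0..1 difficulty scale
  let step = 0.25        // "divide the scale in half" at every step
  for (let i = 0; i < rounds; i++) {
    const correct = askTask(difficulty)    // present a task of this difficulty
    difficulty += correct ? step : -step   // harder after success, easier after failure
    step /= 2
  }
  return difficulty      // the level at which the subject settles estimates their ability
}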

Thus, an adaptive test is a variant of an automated testing system in which the difficulty and the discriminating ability of each task are known in advance. The system is created as a computer bank of tasks, ordered according to the task characteristics of interest. The most important characteristic of the tasks of an adaptive test is their difficulty level, obtained empirically, which means that before getting into the bank each task is empirically tried out on a sufficiently large number of typical students of the contingent of interest. The phrase "contingent of interest" stands here for the more rigorous concept known in science as the "general population".

The origins of the adaptive approach can be traced back to the pedagogical works of Comenius, Pestalozzi, and Diesterweg, who are united by the ideas of nature-conformable and humane learning. At the center of their pedagogical systems was the learner. For example, in the little-known work of A. Diesterweg (2), "Didactic Rules" (Kiev, 1870), one can read the following words: "Teach according to nature... Teach without gaps... Begin teaching where the student left off... Before starting to teach, you need to explore the point of departure... Without knowing where the student stopped, it is impossible to teach him properly." Insufficient awareness of the real level of students' knowledge, and the natural differences in their ability to assimilate the knowledge offered, became the main reason for the emergence of adaptive systems based on the principle of individualized learning. This principle is difficult to implement in the traditional classroom form.

Before the advent of the first computers, the most well-known system close to adaptive learning was the so-called “Full Learning System”. It has already been written about in USh No. 26/99.

3. So-called "criterion-oriented tests"
This is a very conditional and, in principle, incorrect name for a group of tests that have gained some currency and recognition in our country. Unfortunately, an attempt was even made to introduce this name into the text of our laws on attestation and standards, which the author of this article opposed (3). In essence, we are dealing not with tests but with a particular kind of interpretation of test results.

If the main task is to find out which elements of the content of the academic discipline have been mastered by a given subject, then we are dealing with the pedagogical approach to interpreting test results. It determines what, out of the general set of tasks (the domain, in the English term), the subject knows and what he does not know. The interpretation of the results is carried out by teachers, in the language of the academic discipline.

The conclusion is built along a logical chain: the content of the academic discipline → the general set of tasks for measuring knowledge → the test, as a sample of tasks from this set → the subject's answers → a probabilistic conclusion about his knowledge of the academic discipline. When such tests are the focus, a large number of tasks and a fairly complete definition of the content of the discipline being studied are required. The interpretation of the results is carried out by subject teachers.

The debate revolves around two main questions:

1) the correctness of the content of the test, which means the accuracy of the wording of its tasks, subject-scientific validity, and the suitability of the test for checking the knowledge of interest in the given group of subjects. When arguing in favor of a particular test, subject teachers rely on the conceptual apparatus, the principles of the language, and, in general, the knowledge of the academic discipline they teach. In such cases one speaks of tests with a content-oriented interpretation of the results (4). This is the so-called case of Domain-Referenced Testing, which can be translated as relating the knowledge shown by the test results to the knowledge whose complete list is represented in the general population (domain).

2) the validity of assessing knowledge of the whole subject based on the results of testing subjects on a small sample of test items - a sample from a potentially or actually existing general population of all the tasks that could be given to the subjects for a confident and reasonable assessment. In essence, this is the question of justifying the accuracy of the inductive inference about knowledge of a large number of questions on the basis of answers to a small number of test items.

The second type of test is associated with an orientation toward such specific goals and objectives as checking the level of assimilation of a relatively short list of required knowledge, skills, and abilities, which serve as a given standard or criterion of assimilation. For example, for the certification of graduates of educational institutions, it is important to have tasks that allow a conclusion to be drawn about the minimum acceptable competence of graduates; abroad they are called exactly that - Minimum Competency Tests. When the minimum admissible level of knowledge is checked, the content of the tasks is deliberately made easy. Since such tasks must be performed by all graduates admitted to certification by the educational institution, one cannot speak here of tests as a method of objective and efficient measurement of subjects with different levels of preparedness, in the strict sense of the concept of a test. This approach was developed for education authorities, which face the need to check quickly the state of education in a large number of educational institutions and to prevent the latter from falling below the minimum permissible level of requirements.

In the Western literature, in such cases one speaks of tests with a criterion-oriented interpretation of the results. The conclusion is built along a logical chain: tasks → answers → conclusions about the compliance of the test subject with the given criterion. Criterion-oriented interpretation means comparing the content of the attestation materials with the test results and concluding which part of the given standard, in terms of its requirements, has actually been mastered, and at what level.

With a criterion-oriented interpretation, a somewhat smaller number of tasks is needed, through which it is determined what the subject knows and does not know within the given standard. In other words, here the answers are evaluated not relative to the whole area (domain) of the required knowledge, but only relative to the area limited by a specific standard or level (criterion) of knowledge. As in the case of Domain-Referenced Testing, the interpretation of the results is carried out in the language of the academic discipline, but now mainly by employees of the education authorities and by those teachers whose opinions the administrators rely on during certification.

In the author's view, the "tests" used in this case do not meet the genuine test requirements imposed on traditional and adaptive tests. With a criterion-oriented interpretation, what is used to diagnose a predetermined level of preparedness is essentially not tests in the traditional sense of the method, but sets of tasks in test or other form - nothing more. The word is the same, but the meaning is different. "Tests" with a criterion-oriented interpretation are often contrasted with tests with a so-called norm-oriented interpretation of the results. In fact, the latter are traditional tests, some of which have parallel variants.

Literature
  1. Rasch, G. Probabilistic Models for Some Intelligence and Attainment Tests. With a Foreword and Afterword by B.D. Wright. The Univ. of Chicago Press, Chicago & London, 1980. 199 pp. For a more accurate perception of the meaning of the concept of "logit", some formalisms may be useful. In essence, G. Rasch introduced two measures: the "knowledge level logit" and the "task difficulty logit". He defined the first as the natural logarithm of the ratio of the proportion of the subject's correct answers, over all tasks of the test, to the proportion of incorrect answers, and the second as the natural logarithm of another ratio - the proportion of incorrect answers to a test task to the proportion of correct answers to the same task, over the set of subjects.
  2. Diesterweg A. Didactic Rules. Kiev, 1870.
  3. See, for example, the article: Avanesov V.S. "Educational standards need to change." USh, No. 46, December 1998.
  4. Lively W. (Ed.). Domain Referenced Testing. Educational Technology Publications, Englewood Cliffs, N.J., 1974.
  5. Berk R.A. (Ed.). A Guide to Criterion-Referenced Test Construction. The Johns Hopkins Univ. Press, Baltimore, 1984.

Annotation: The lecture is the second of three examining the levels of the verification process. The topic of this lecture is the process of integration testing, its tasks and goals. Organizational aspects of integration testing are considered: the structural and temporal classification of integration testing methods, and integration testing planning. The purpose of this lecture is to give an idea of the process of integration testing and of its technical and organizational components.

20.1. Tasks and goals of integration testing

Testing and verifying the individual modules that make up a software system should result in the conclusion that these modules are internally consistent and meet the requirements. However, individual modules rarely function on their own, so the next task after testing the individual modules is to test the correctness of the interaction of several modules combined into a single whole. Such testing is called integration testing. Its purpose is to make sure that the system's components work together correctly.

Integration testing is also called system architecture testing. On the one hand, this name reflects the fact that integration tests include checks of all the possible kinds of interaction between software modules and elements that are defined in the system architecture - in this sense, integration tests check the completeness of the interactions in the implementation under test. On the other hand, the results of integration tests are one of the main sources of information for improving and refining the system architecture and the inter-module and inter-component interfaces - from this point of view, integration tests check the correctness of the interactions between the system's components.

An example of checking the correctness of interaction is a pair of modules, one of which accumulates protocol messages about received files while the other displays this protocol on the screen. The functional requirements for the system state that messages must be displayed in reverse chronological order. However, the message storage module stores them in direct order, and the output module uses a stack to output them in reverse order. Unit tests that exercise each module individually will find nothing here - the opposite situation is quite possible, in which messages are stored in reverse order and output using a queue. A potential problem can be detected only by checking the interaction of the modules with integration tests. The key point is that the system as a whole must output messages in reverse chronological order; so, having checked the output module alone and found that it outputs messages in direct order, we cannot claim to have found a defect.
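
A hedged sketch of this situation, with invented modules (JavaScript is used here only for brevity); the point is that the integration test asserts only the combined, end-to-end behavior:

// messageLog stores protocol messages in arrival (direct) order.
const messageLog = {
  messages: [],
  add(msg) { this.messages.push(msg) }
}

// display outputs the protocol; internally it reverses the stored order.
const display = {
  render(log) { return [...log.messages].reverse() }
}

// Integration test: the requirement "reverse chronological order" is checked for the pair as a whole.
messageLog.add("file A received")
messageLog.add("file B received")
const shown = display.render(messageLog)
console.assert(shown[0] === "file B received") // newest first
console.assert(shown[1] === "file A received")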

As a result of integration testing and the elimination of all the defects it reveals, a consistent and complete architecture of the software system is obtained; in other words, integration testing can be regarded as testing of the architecture and of the low-level functional requirements.

Integration testing is, as a rule, an iterative process in which the functionality of an ever larger set of modules is tested.

20.2. Organization of integration testing

20.2.1. Structural classification of integration testing methods

As a rule, integration testing is carried out after the completion of unit testing for all integrated modules. However, this is not always the case. There are several methods for conducting integration testing:

  • bottom-up testing;
  • monolithic testing;
  • top-down testing.

All of these techniques are based on knowledge of the architecture of the system, which is often depicted in the form of block diagrams or function call diagrams. Each node in such a diagram represents a software module, and the arrows between them represent call dependencies between modules. The main difference between integration testing methodologies lies in the direction of movement along these diagrams and in the breadth of coverage in one iteration.

Bottom-up testing. When using this method, it is assumed that all the software modules making up the system are tested first, and only then are they combined for integration testing. With this approach, error localization is greatly simplified: if the modules have been tested separately, then an error in their joint operation is a problem of their interface. The scope of the search for problems is quite narrow for the tester, and therefore the probability of correctly identifying a defect is much higher.


Fig. 20.1.

However, the bottom-up method of testing has a significant drawback: the need to develop drivers and stubs for unit testing before integration testing begins, and the need to develop drivers and stubs during the integration testing of part of the system's modules (Fig. 20.1).

On the one hand, drivers and stubs are a powerful testing tool; on the other hand, their development requires significant resources, especially when the composition of the integrated modules changes. In other words, one set of drivers may be needed for the unit testing of each module, a separate driver and set of stubs for testing the integration of two modules from the set, another for testing the integration of three modules, and so on. This is primarily because integrating modules removes the need for some stubs and also requires changing the driver so that it supports the new tests that involve several modules.
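
For illustration, a hedged sketch of what a driver and a stub look like; the modules here are invented and are not taken from the lecture's example system:

// Stub: stands in for a lower-level storage module that is not integrated yet.
const storageStub = {
  saved: [],
  save(record) { this.saved.push(record) } // records the call instead of writing anywhere
}

// Module under test (invented): parses a "name = value" line and hands the result to storage.
function parseAndStore(line, storage) {
  const [name, value] = line.split("=")
  storage.save({ name: name.trim(), value: value.trim() })
}

// Driver: the test code that exercises the module under test through the stub.
parseAndStore("timeout = 30", storageStub)
console.assert(storageStub.saved[0].name === "timeout")
console.assert(storageStub.saved[0].value === "30")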

Monolithic testing assumes that the individual components of the system have not undergone serious preliminary testing. The main advantage of this method is that there is no need to develop a test environment, drivers, or stubs. After all the modules have been developed, they are integrated and the whole system is checked at once. This approach should not be confused with system testing, which is the subject of the next lecture. Although monolithic testing checks the operation of the entire system as a whole, its main task is to find problems in the interaction of individual modules of the system. The task of system testing, by contrast, is to evaluate the qualitative and quantitative characteristics of the system from the standpoint of their acceptability to the end user.

Monolithic testing has a number of serious shortcomings.

  • It is very difficult to identify the source of an error (that is, the erroneous piece of code). Most modules must be assumed to contain a bug, and the problem boils down to determining which of the errors in all the modules involved led to the observed result. Error effects may also overlap, and a bug in one module can block the testing of another.
  • It is difficult to organize the fixing of bugs. As a result of testing, the tester records the problems found; the defect in the system that caused each problem has to be fixed by a developer. Since the units under test are usually written by different people, the question arises of who is responsible for finding and eliminating the defect. With such "collective irresponsibility", the rate at which defects are eliminated can drop sharply.
  • The testing process is poorly automated. The advantage (no additional software accompanying the testing process) turns into a disadvantage: every change made requires all the tests to be repeated.

Top-down testing assumes that the integration testing process follows development. First, only the topmost control level of the system is tested, without the lower-level modules. Then, step by step, the lower-level modules are integrated with the higher-level ones. With this method there is no need for drivers (the role of the driver is played by a higher-level module of the system), but the need for stubs remains (Fig. 20.2).

Different testing professionals hold different opinions about which of the methods is more convenient for the real testing of software systems. Jordan argues that top-down testing is the most acceptable in real situations, while Myers believes that each approach has its own advantages and disadvantages but that, on the whole, the bottom-up method is better.

The literature often mentions a method of integration testing of object-oriented software systems that is based on identifying clusters of classes which together have some closed and complete functionality. At its core, this approach is not a new type of integration testing; it simply changes the minimum element resulting from integration. When integrating modules in procedural programming languages, any number of modules can be integrated, provided that stubs are developed. When integrating classes into clusters, there is a rather loose restriction on the completeness of the cluster's functionality. However, even in the case of object-oriented systems, any number of classes can be integrated with the help of stub classes.

Regardless of the integration testing method used, it is necessary to take into account the extent to which the integration tests cover the system's functionality. One published approach proposes estimating the degree of coverage on the basis of control calls between functions and of data flows. Under such an assessment, the code of all modules on the structural diagram of the system must be executed (all nodes must be covered), all calls must be executed at least once (all connections between nodes on the structural diagram must be covered), and all sequences of calls must be executed at least once (all paths on the structural diagram must be covered).

From an institute course on programming technologies, I took away the following classification of testing types (the criterion being the degree of code isolation). Testing can be:

  • Block (unit testing) - testing a single module in isolation.
  • Integration testing - testing a group of interacting modules.
  • System testing - testing the system as a whole.
The classification is good and clear. In practice, however, it turns out that each type of testing has its own particularities, and if they are not taken into account, testing becomes burdensome and is not done properly. Here I have collected approaches to the real-world application of the various kinds of testing. And since I write in .NET, the links will be to the relevant libraries.

Unit Testing

Block (modular, unit) testing is the easiest for a programmer to understand. In essence, it is testing the methods of a single program class in isolation from the rest of the program.

Not every class is easy to cover with unit tests. When designing, you need to take testability into account and make class dependencies explicit. To ensure testability, you can use the TDD methodology, which requires writing the test first and only then the code that implements the method under test; the architecture then turns out to be testable. Dependencies can be untangled with Dependency Injection: each dependency is then explicitly mapped to an interface, and it is explicitly defined how the dependency is injected - into the constructor, into a property, or into a method.

There are special frameworks for unit testing, for example NUnit or the test framework in Visual Studio 2008. To make it possible to test classes in isolation, there are special mock frameworks, for example Rhino Mocks. They can automatically create stubs for dependency classes from their interfaces and let you define the required behavior for them.

A lot of articles have been written on unit testing. I really like the MSDN article Write Maintainable Unit Tests That Will Save You Time And Tears, which explains well and clearly how to create tests that do not become a maintenance burden over time.

Integration testing

Integration testing, in my opinion, is the most difficult to understand. There is a definition - it is testing the interaction of several classes that do some work together - but it is not clear how to test according to this definition. You can, of course, build on the other types of testing, but that is fraught with problems.

If we approach it like unit testing in which dependencies are not replaced by mock objects, we run into problems. For good coverage you need to write a lot of tests, since the number of possible combinations of interacting components grows polynomially. In addition, such unit tests check exactly how the interaction is carried out (see white-box testing). Because of this, after a refactoring in which some interaction is extracted into a new class, the tests fail. A less invasive method is needed.

Nor can integration testing be approached as merely more detailed system testing. In that case, on the contrary, there will be too few tests to check all the interactions used in the program. System testing is too high-level.

I have come across a good article on integration testing only once - Scenario Driven Tests. After reading it, along with Ayende's book DSLs in Boo: Domain-Specific Languages in .NET, I got an idea of how integration testing can be done after all.

The idea is simple. We have input data, and we know how the program should behave on it. Let's write this knowledge into a text file. This will be the specification for the test data, recording what results are expected from the program. Testing will then determine whether the program's actual behavior matches the specification.

I will illustrate with an example. The program converts one document format into another. The conversion is tricky, with a lot of mathematical calculations. The customer has handed over a set of typical documents that he needs converted. For each such document we will write a specification, recording all sorts of intermediate results that our program should reach during the conversion.

1) Suppose the submitted documents contain several sections. Then in the specification we can state that the parsed document should have sections with the specified names:

$SectionNames = Introduction, Article text, Conclusion, Literature

2) Another example. During conversion, geometric figures need to be split into primitives. The split is considered successful if the primitives together completely cover the original figure. From the documents sent to us, we will select various figures and write specifications for them. The fact that a figure is covered by primitives can be expressed as follows:

$IsCoverable = true

Clearly, checking such specifications requires an engine that reads the specifications and verifies that the program's behavior complies with them. I wrote such an engine and was pleased with this approach. Soon I will release the engine as open source. (UPD: Posted)
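
A minimal sketch of what such an engine might do - this is an illustration only, not the author's actual engine; the "$Name = value" spec format is taken from the examples above, and the sketch is in JavaScript rather than .NET purely for brevity:

const fs = require("fs")

// Reads lines like "$SectionNames = Introduction, Article text, Conclusion, Literature".
function readSpec(path) {
  const spec = {}
  for (const line of fs.readFileSync(path, "utf8").split("\n")) {
    const match = line.match(/^\$(\w+)\s*=\s*(.+)$/)
    if (match) spec[match[1]] = match[2].trim()
  }
  return spec
}

// "actual" holds the intermediate results reported by the converter,
// e.g. { SectionNames: "Introduction, Article text, Conclusion, Literature", IsCoverable: true }.
function checkSpec(spec, actual) {
  const failures = []
  for (const [key, expected] of Object.entries(spec)) {
    if (String(actual[key]) !== expected) {
      failures.push(key + ': expected "' + expected + '", got "' + actual[key] + '"')
    }
  }
  return failures // an empty array means the program's behavior matches the specification
}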

This type of testing is integration testing, since the interaction code of several classes is exercised during the test. Moreover, only the result of the interaction matters, not the details and order of the calls, so the tests are not affected by code refactoring. There is no over-testing or under-testing: only those interactions that occur when processing real data are tested. The tests themselves are easy to maintain, since the specification is easy to read and easy to change to meet new requirements.

System testing

System testing is testing the program as a whole. For small projects this is, as a rule, manual testing: launch it, click around, make sure it works (or doesn't). It can be automated, and there are two approaches to automation.

The first approach is to use a variation of the MVC pattern - Passive View (here is another good article on variations of the MVC pattern) - and to formalize the user's interaction with the GUI in code. System testing then comes down to testing the Presenter classes and the logic of the transitions between Views. But there is a nuance: if you test Presenter classes in the context of system testing, you should replace as few dependencies as possible with mock objects. And here arises the problem of initializing the program and bringing it into the state required to begin testing. The Scenario Driven Tests article mentioned above goes into more detail about this.

The second approach is to use special tools to record user actions. In this case the program itself is actually launched, but the clicking of buttons is performed automatically. For .NET, an example of such a tool is White; WinForms, WPF and several other GUI platforms are supported. The rule is this: for each use case, a script is written describing the user's actions. If all the use cases are covered and the tests pass, the system can be handed over to the customer and the acceptance certificate signed.

Pedagogical test

A pedagogical test is defined as a system of tasks of specific content, increasing difficulty, and specific form, which makes it possible to measure the level of students' preparedness qualitatively and efficiently and to evaluate its structure. In a pedagogical test, the tasks are arranged in order of increasing difficulty - from the easiest to the most difficult.

Integrative test

An integrative test can be called a test consisting of a system of tasks that meet the requirements of integrative content, test form, and increasing task difficulty, aimed at a generalized final diagnosis of the preparedness of a graduate of an educational institution.

The diagnosis is carried out by presenting tasks whose correct answers require integrated (generalized, clearly interconnected) knowledge in the field of two or more academic disciplines. Creating such tests is within the reach of only those teachers who know a number of academic disciplines, understand the important role of interdisciplinary connections in learning, and are able to create tasks whose correct answers require students to have knowledge of various disciplines and the ability to apply that knowledge. Integrative testing is preceded by the organization of integrative learning. Unfortunately, the current class-lesson form of instruction, combined with excessive fragmentation of academic disciplines and with the tradition of teaching individual disciplines (rather than generalized courses), will hamper the introduction of an integrative approach into learning and into the control of preparedness for a long time to come.

The advantage of integrative tests over heterogeneous ones lies in the greater informative content of each task and in the smaller number of tasks themselves.

The methodology for creating integrative tests is similar to the methodology for creating traditional tests, with the exception of the work on determining the content of tasks. To select the content of integrative tests, the use of expert methods is mandatory.

Adaptive test

The adaptive test works like a good examiner. First it "asks" a question of medium difficulty, and the answer is evaluated immediately. If the answer is correct, the estimate of the test-taker's ability is raised and a more difficult question is asked next; if the answer is wrong, an easier one follows.

The main advantage of an adaptive test over a traditional one is efficiency. An adaptive test can determine the level of knowledge of the test-taker with fewer questions (sometimes the length of the test is reduced to 60%).

In the adaptive test, on average, more time is allocated to each question for reflection than in the regular test. For example, instead of 2 minutes per question, an adaptive test taker might have 3 or 4 minutes (depending on how many questions they have to answer).

The reliability of adaptive test results is the same as that of fixed-length tests. Both types of tests equally accurately assess the level of knowledge.

However, it is very widely believed that the adaptive test more accurately assesses the level of knowledge. This is not true.
