Bennet Mead
Statistician, U.S. Department of Justice
This article, originally published in 1937, shows the Federal Probation system’s awareness very early on of both the need to measure outcomes and the problems of doing so. Mr. Mead’s discussion of how to define “successful” probation and how to grapple with self-report dilemmas also reminds us that measuring outcomes has always presented inherent challenges.
IS THERE A measure of probation success? Before attempting to answer this question let us first define its meaning. This may best be done by first determining what constitutes probation success, and then considering possible methods of measuring such success.
In defining the meaning of probation success, it should first be emphasized that success is positive in its nature. Most people undoubtedly conceive of success as something far more vital than mere absence of failure. The sort of success which is no more than absence of failure, becomes a pale shadow of success, a mediocre thing scarcely better than failure.
It follows that no satisfactory measure of probation success can be obtained merely by counting up those definite and obvious failures who are customarily referred to as probation violators, or by determining the ratio of such violators to the entire group handled on probation. This may be illustrated by some recent statistics from the experience of the Probation System of the United States Courts. During the year ending June 30, 1936, there were 997 Federal probationers who were officially declared violators, either by revocation of their probation, by their arrest, or by issuance of warrants for their arrest on charges constituting violation of the terms of their probation. During that same year, a total of 29,821 persons were at one time or another under the supervision of United States Probation officers. If we calculate the ratio of violators to the total number handled on probation, we find that in the fiscal year 1935-36, the violators formed 3.3 per cent of the total handled.
Now this percentage does measure, after a fashion, the frequency of failure on probation, and thus it affords a crude negative measure of probation success. That is, we may say that during the year under consideration, and so far as official records are concerned, the entire group of probationers supervised, exclusive of those formally declared violators, are to be considered as probation successes. On this basis, the United States Probation System might rashly claim that 96.7 per cent of its cases on probation were handled successfully, and we might then go further and jump recklessly to the conclusion that all is well with us, and that the federal probation system is discharging as effectively as possible its task of rehabilitating the probationers under its care.
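The arithmetic behind these two figures can be checked in a few lines. The following is only an illustrative sketch in Python; the function name `violation_rate` is an assumption made for this example and is not drawn from any Probation System record-keeping.

```python
def violation_rate(violators: int, total_supervised: int) -> float:
    """Return violators as a percentage of all persons supervised."""
    if total_supervised <= 0:
        raise ValueError("total_supervised must be positive")
    return 100.0 * violators / total_supervised


# Figures quoted in the text for the year ending June 30, 1936.
rate = violation_rate(997, 29_821)

# The "crude negative measure": everyone not formally declared a violator.
non_violator_share = 100.0 - rate

print(f"violators: {rate:.1f} per cent")
print(f"non-violators: {non_violator_share:.1f} per cent")
```

The second figure is simply the complement of the first, which is why it says nothing about whether any probationer actually improved.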
If, however, we keep in mind the true concept of success, as positive achievement rather than dead-level mediocrity, we must admit that this 96.7 per cent of alleged success does not and cannot reveal the actual extent of our achievement or lack of achievement. So long as we do not actually know whether our so-called “successes” involve any real improvement in the behavior of the probationers, we must not allow ourselves to become complacent about the situation.
On any realistic basis, then, success must be considered as relative and variable. If, therefore, we propose to measure success, we need to determine not only the bare fact of improvement in behavior, but also the degree of improvement.
Our analysis thus far has led to the definition of success as positive achievement, varying in degree. We have also incidentally discussed one unsatisfactory measure of probation success, the percentage of non-violators.
It remains now to consider what better methods are available for measuring probation success, that is, the degree or extent of improvement. Nearly three years ago, in the Central Office of the United States Probation System, we began to experiment with a very crude and approximate device for measuring the degree of probation success. This device consists in asking the probation officers to estimate for each probationer passed from supervision the degree of his improvement while on probation.
In undertaking this experiment, we frankly recognized that this device was in no sense an accurate or scientific measuring device. But it seemed to us that it would be worth while to experiment with it, primarily as a means for insuring some systematic self-criticism by the federal probation officers of their own work and its results. There is no question that this plan has been of value from this point of view.
Up to the present time, however, we have hesitated to publish any actual statistics based upon these reports on outcome. It has been apparent from our critical examination of them that no uniform standards have been applied by the various officers in forming their judgment as to the degree of improvement in their charges. While this device has had considerable educational value, we must admit that it has not served as an accurate measure of success.
Perhaps we can gain some idea of the inadequacies of this measuring device by looking at some of the results obtained by it for the fiscal year ending June 30, 1936. Probation officers were asked to rate the outcome of probationary treatment for those passed from their supervision in one of five degrees, as showing striking improvement, moderate improvement, slight improvement, no improvement, or as violating probation. Out of a total of 6,298 cases thus classified, 1,685, or 26.8 per cent, were classed as having achieved striking improvement, and 2,183, or 34.7 per cent, were classed in the moderate improvement group. It would, however, be most unsafe to accept these figures as giving a true picture. Rather do they indicate over-optimism on the part of the probation officers making the evaluation.
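The percentages quoted above can be reproduced from the counts given. The sketch below is an illustration in Python, not a tabulation procedure used by the Central Office; the names `share`, `reported`, and `remainder` are assumptions for this example. Counts for the slight improvement, no improvement, and violator groups are not stated in the text, so they appear only as a combined remainder.

```python
TOTAL_CLASSIFIED = 6_298

# Only these two category counts are given in the text.
reported = {
    "striking improvement": 1_685,
    "moderate improvement": 2_183,
}


def share(count: int, total: int) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


shares = {label: share(n, TOTAL_CLASSIFIED) for label, n in reported.items()}

# Cases in the three categories whose separate counts are not stated.
remainder = TOTAL_CLASSIFIED - sum(reported.values())

for label, pct in shares.items():
    print(f"{label}: {pct} per cent")
print(f"all other ratings combined: {remainder} cases, "
      f"{share(remainder, TOTAL_CLASSIFIED)} per cent")
```

Taken together, the two highest ratings account for 3,868 of the 6,298 cases, or a little over three-fifths of the total.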
The greatest deficiency of this method of measuring success is its subjective quality, for it suffers two-fold in that it relies not only on a man’s judgment of another man, but also on a man’s judgment of himself and his own work. In this connection, it is interesting to note that over-optimism in regard to results is not limited to the less trained and qualified probation officers. In order to check this point, we compared the rating of outcome for fourteen selected probation units. The comparison for these selected units and for all other units is very interesting, because it shows no marked variation.
In the totals we find that the selected units reported 24.2 per cent of their cases as showing striking improvement. The remaining units classified 27.5 per cent of their cases in this group. The figures for the “moderate improvement” group reveal a similar situation, with 34.1 per cent of the cases reported by the selected units in this group, and 34.8 per cent of the cases for the remaining units in this group. Thus we see that the mass results obtained with this device differ only slightly whatever may be the quality of the personnel. It is apparent that the difficulty lies in basic weaknesses of our measuring instrument, in that ratings are made on a subjective or “hunch” basis.
It is probable that this method could be improved by making the classification of outcomes the special responsibility of some one member of each probation staff, who would be more capable of a judicial viewpoint than the officer in charge of a particular case. I believe that we should continue to experiment with this device, if possible introducing changes which will lead to greater uniformity and accuracy in the results. But even if we were successful in refining this procedure of self-evaluation which I have just described, it would still be far from adequate.
Before a reliable evaluation of outcomes can be made, it is necessary that probation departments institute a thorough system of case study for each individual who comes under supervision. No plan of evaluation can be considered accurate which does not reveal what types of cases are better and worse in terms of social adjustment at the end of the probation period. The individual offender, then, must be the unit of evaluation, and it follows that the case study method should be used whenever possible.
Many probation departments may have been frightened away from this method of approach because of a mistaken view as to the minimum requirements in personnel and organization for this type of work. Any department which has a trained and qualified staff could at least experiment with the case study method of evaluation, by selecting a part of its cases for intensive supervision and study.
In order to use the case study method to best advantage, it is necessary to design a special form of case summary to bring together data which will show the status of the probationer at various times. This progress record should start at the time he is placed on probation, with records of his physical and mental condition, education, recreational habits, industrial experience, and family and community conditions. Similar analyses should be made at various intervals during the probation period, at the time of discharge, and if feasible, a year or more after discharge. The progress record should also cover the facts about the treatment program, as it was attempted and as it was actually executed, and any changes in the behavior of the probationer.
The purpose of such a progress record as the one outlined will be to compare the status of an individual at various times during the probation period rather than to compare two individuals at any given time. At first, it will probably be necessary to treat every personality phase separately. In this way, we may be able to record changes in the personality of the probationer, and also to compare individuals in regard to social attitudes and usefulness. Some psychologists now claim to have developed personality rating scales which test emotional as well as intellectual factors.
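One way to picture such a progress record is as a series of dated status snapshots for a single probationer, compared phase by phase. The sketch below is purely illustrative and written in Python; the class names, the numeric 1-to-5 rating scale, and the sample values are all assumptions introduced for this example, not a form used by any probation department.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StatusSnapshot:
    stage: str               # e.g. "placed on probation", "discharge"
    ratings: dict[str, int]  # phase -> rating on a hypothetical 1-5 scale


@dataclass
class ProgressRecord:
    probationer_id: str
    snapshots: list[StatusSnapshot] = field(default_factory=list)

    def change_by_phase(self) -> dict[str, int]:
        """Compare the first and last snapshots, one phase at a time."""
        if len(self.snapshots) < 2:
            raise ValueError("need at least two snapshots to compare")
        first, last = self.snapshots[0], self.snapshots[-1]
        return {
            phase: last.ratings[phase] - first.ratings[phase]
            for phase in first.ratings
            if phase in last.ratings
        }


# Invented sample values, for illustration only.
record = ProgressRecord("case-001")
record.snapshots.append(
    StatusSnapshot("placed on probation", {"education": 2, "industrial experience": 1})
)
record.snapshots.append(
    StatusSnapshot("discharge", {"education": 3, "industrial experience": 3})
)
print(record.change_by_phase())
```

The comparison is made within one individual's record over time, and each phase is kept separate rather than merged into a single score, in keeping with the approach described above.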
We must bear in mind that even after the progress record has been made available for practical use, it will not work automatically. On the contrary, a high degree of technical skill will be necessary to secure accurate and consistent results. Likewise the analysis and interpretation of these records will require much statistical training and experience. The task of evaluation will require effective collaboration between a number of professional groups, including experts in vocational guidance and education, psychologists, psychiatrists, doctors, prison and probation administrators, as well as experts in social research. It is only through this many-sided approach that we can hope to achieve a truly significant evaluation, for no one professional group is capable of fulfilling this task without the assistance of many others.
The system of classification in the federal prison system follows the general theory I have outlined, in dealing with offenders admitted to the institution. But no such procedure has as yet been instituted for the Federal Probation System. However, there are in the country some few probation departments which attempt this type of work. I understand that the Probation Department in the Court of General Sessions of New York City follows in general the principles here outlined, as do the Probation offices of the Essex County, New Jersey, Court of Common Pleas and the Westchester County, New York, Probation Department, to mention only a few.
Agencies and institutions dealing with crime and delinquency have generally lagged far behind certain other social organizations in the use of measuring devices, specifically in the creation of adequate evaluation techniques. We must recognize that the approximate methods of evaluation in use at the present time have serious limitations. All probation workers are aware of the complexity of the crime problem. They have ample opportunity to know from actual experience that economic insecurity and unemployment, low incomes, poor housing, degrading family and neighborhood life and their surrounding conditions foster the growth and the spread of crime and delinquency.
It is necessary that we see probation in its proper relation to all the other essential elements in a program of crime control. We must not expect too much from this device, nor must we be content with too little. It seems to me reasonable to hope that thorough, scientific evaluation of probation work may disclose further facts about the underlying social and economic causes of crime and thus stimulate action for crime prevention. Even though as yet we have not put into practice methods of probation evaluation which can be relied upon, the ultimate ends to be gained in devising a satisfactory system of this kind warrant all the effort and attention which we can give it.
At the time when we introduced the reporting of degree of improvement in the United States Probation System, we realized clearly the need for providing some yardstick or standard which would help the probation officers to make their estimates of improvement on the basis of the specific nature of the improvement needed in each case. Accordingly we introduced at the same time, beginning in July, 1934, the plan of having the probation officers report, for each probationer received for supervision, the particular obstacles and handicaps affecting the probationer, which needed to be overcome if the case was to be successfully handled.
For the fiscal years ending June 30, 1935 and 1936, we have tabulated and included in our annual reports summaries of the information furnished by probation officers concerning handicaps and obstacles. This appraisal of handicaps and obstacles should in the course of time enable the probation officers to make more accurate judgments as to the degree of improvement, since the information concerning the initial obstacles and handicaps should be available in the case records for increasing numbers of probationers who are passed from supervision. Up to the present time, however, the record of obstacles and handicaps has not been available for the large majority of probationers passed from supervision. This condition may help to explain some of the deficiencies in rating the degree of improvement.
In conclusion, we may summarize by saying that we have undertaken to define the meaning of probation success and to determine what methods are available for measuring probation success. We have defined success for the purposes of this discussion as being positive in its nature and decidedly variable in its degree. This has led us to the conclusion that we cannot measure success by counting up the percentage of failure, and that mere probation violation rates are therefore of very little use for measuring probation success.
We have briefly reviewed the experience of the United States Probation System in attempting to measure probation success by classifying probationers at the end of their periods of supervision according to the degree of improvement in their behavior, as judged by the federal probation officers. This experimental procedure we have found to be of educational value, but of no real scientific value. The failure of this device to yield satisfactory results has apparently been due to the unavoidable tendency of our probation officers, in common with other mortals, to be over-optimistic in appraising the results of their own work. None the less, this device has proven of sufficient value to suggest that it should be refined and improved rather than abandoned.
Ultimately, the measurement of degree of improvement needs to be done by persons other than the officers responsible for case supervision, and the rating needs to rest on careful appraisal of the improvement made in specific traits of personality and specific phases of conduct.
Some progress has been made, but a tremendous amount of work remains to be done before we can hope to make any scientific evaluation of outcomes. The keeping of systematic and detailed records of the status and the progress of each probationer from the time he is placed on probation will pave the way for increasingly accurate measurement of the degree of success of individual probationers.
