ORIGINAL RESEARCH ARTICLE
Culture and Brain (2022) 10:167–193
https://doi.org/10.1007/s40167-022-00111-6

Abstract
In modern neuropsychology, the acknowledgment of cultural variables and their significant influence on cognitive development and performance spawned the development of cross-cultural neuropsychological tests. This article describes the development and psychometric properties of the Multicultural Neuropsychological Scale (MUNS), a screening scale that includes stimuli common to most cultures in the world. It is appropriate for use with adults and the elderly and with lower and higher education participants. In the validity study, we compared the performances of control and cognitively impaired participants. Test reliability was assessed using the standardized regression-based method (n = 71). Norms were developed using a regression-based method. One hundred and eighty-four Spanish-speaking participants of both sexes were recruited for the normative sample. Participants were between 15 and 80 years old. The education range was 4–20 years of schooling. Evidence for its cross-cultural utility was obtained by comparing the performance of two (Argentinian and American) age and education matched samples. Mean differences between the control and clinical groups were significant, yielding a large effect size (η² = 0.20). Raw MUNS retest total scores were predicted by MUNS pretest total score and age. Age, reading fluency, and years of schooling significantly influenced test scores. The validity study confirmed that the test discriminates between individuals with and without cognitive impairments, including participants with mild cognitive impairments. Reliability is satisfactory. The performance of both samples showed no significant differences between them in all subtests except for one.
Keywords Multicultural neuropsychological scale · Validity · Reliability · Normative data

Accepted: 12 June 2022 / Published online: 8 August 2022
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022

The multicultural neuropsychological scale (MUNS): validity, reliability, normative data and cross-cultural evidence
Alberto Luis Fernández1,2 · Gabriel Jáuregui Arriondo1 · Maximiliano Folmer3 · Marcelo Vaiman1,2 · Gazul Rotela Leite1 · David J. Hardy4,5

Introduction
Increasing migratory movements resulted in an estimated population of 271,642,105 migrants around the world in 2019 (United Nations, 2019). Brain-impairing conditions, from traffic or workplace accidents to stroke and dementia, can and will affect a portion of migrants everywhere. Consequently, many of them will need neuropsychological services. Unfortunately, most current neuropsychological tests are not appropriate for the assessment of migrant populations. As an overwhelming majority have been developed in Western cultures, neuropsychological tests contain culturally specific information that makes them inappropriate for use on populations outside these cultures. Examples of culturally biased tests include the Trail Making Test, the Boston Naming Test, the Wechsler Memory Scale (WMS), and the California Verbal Learning Test. Examples of how cultural bias is woven into the fabric of these tests are easy to find.
For example, in the WMS, the logical memory section tells participants a story of a fictional character named Anna Thompson, who lives in South Boston. The name and city of this narrative's character, however commonplace they may be in the United States, are unfamiliar to test givers and test subjects who are from or live in the Middle East, Asia or Africa. Despite only being appropriate for Western cultures, tests such as the WMS are still some of the most used around the world.

Not only are these tests not culturally appropriate, but they are also not developed for the assessment of participants with lower levels of education. Psychologists at Western universities designed most of these tests and then used them on the population sample most easily accessible in their context: young and highly educated students. This population is known for being WEIRD (Western, Educated, Industrialized, Rich, and Democratic) and therefore not representative of the world as a whole (Henrich, Heine, & Norenzayan, 2010). The estimated world literacy rate is 86%, which means that around 750 million adults cannot read or write (UNESCO Institute for Statistics, 2017). Furthermore, according to the United Nations, one-third of the 244 million migrants estimated in the world in 2013 only had a few years of education, none of which took place at a level higher than secondary education (United Nations-OECD, 2013). It is apparent that current neuropsychological tests are appropriate only for a minority of the world population.

The most common response to these issues has been adapting tests, but test adaptation involves a long and costly process that is usually not affordable in environments where neuropsychology is not well developed (Fernandez & Evans, 2022). In addition, while these attempts often address the issue of cultural bias, the inadequacy of the tests for populations with lower education levels remains unresolved.
An alternative solution is the development of cross-cultural tests (CCTs), which are tests developed from the outset for use in different cultural settings. They include universal stimuli that are common to most cultures, in an attempt to avoid cultural specifics. For example, CCTs avoid the use of letters of the alphabet, proper names, currency or specific geographic references such as cities or regions. This approach does not resolve all issues since some cognitive domains require cultural or language-specific tests. For instance, when it comes to language assessment, it is impossible to conduct a deep analysis of language functions such as grammar, syntax or vocabulary without a test specifically designed for the target language. Such characteristics vary in each language. Despite these limitations, CCTs allow for the development of a tool that is easy to translate and most likely demands few or no adaptations. This adaptability is particularly important in regions of the world in which neuropsychology is not well developed. In addition, it is highly convenient for use in countries where neuropsychologists must frequently assess migrants, asylum seekers, refugees and immigrants. As it stands, fair and accurate client assessment is more feasible when using evaluation tools created with culturally appropriate stimuli and designed in the native languages of both the neuropsychologists who administer tests and the migrant populations that take them.

There are a number of successful CCTs, including the Rowland Universal Dementia Assessment Scale (RUDAS), the Cross-Cultural Dementia Screening (CCD), and the European Cross-Cultural Neuropsychological Test Battery (CNTB).
RUDAS, a brief cognitive screening test created to minimize the effects of cultural learning and language differences when evaluating baseline cognitive performance, has demonstrated high sensitivity across multicultural samples and has been translated into around a dozen languages (Komalasari, Chang & Traynor, 2019). The CCD, a neuropsychological tool designed in Europe for the screening of dementia in immigrant populations with lower education levels, can be utilized without an interpreter present (Goudsmit et al., 2017). The CNTB, created for use on populations in Western Europe to identify cognitive impairments for various subtypes of dementia, is yet another example of a successful CCT (Nielsen et al., 2018). These tests are all examples of CCTs that are useful when evaluating immigrant populations.

However, most current CCTs, including the examples previously mentioned, are aimed at assessing the elderly, as most of them have been devised to diagnose dementia. There are no CCTs developed for adults or young adults to date. The Multicultural Neuropsychological Scale (MUNS) was developed as a CCT that is suitable for ages 15 to 90 and over (Fernandez et al., 2018). The MUNS was projected as a neuropsychological assessment tool that is (a) a cross-cultural test; (b) a screening test; (c) capable of providing more information than a brief test; (d) appropriate for use with adults and elderly; (e) able to assess major cognitive functions; (f) appropriate for lower and higher education subjects; (g) psychometrically robust; (h) able to accurately discriminate patients with mild cognitive dysfunctions from healthy participants; (i) easy for test administrators to implement; and (j) inexpensive.

The MUNS is a short scale devised with universal stimuli that are easy to translate into different languages. It consists of seven subtests evaluating five cognitive domains: attention, memory, executive functioning, constructional praxis, and language. In comparison to existing CCTs, the MUNS was designed to convey more information. For example, three subtests cover memory, and each subtest includes several items. Moreover, some subtests, such as the Word List, have a similar format to current standard tests for memory assessment. It also has two versions, one that is appropriate for lower education participants, and another for higher education participants. This article describes the following features of the MUNS: (a) the development of materials; (b) the validity characteristics; (c) the reliability characteristics; (d) normative data for Argentinian higher education participants; and (e) a cross-cultural study.

References
Goudsmit, M., Uysal-Bozkir, O., Parlevliet, J. L., van Campen, J. P. C. M., de Rooij, S. E., & Schmand, B. (2017). The Cross-Cultural Dementia Screening (CCD): A new neuropsychological screening instrument for dementia in elderly immigrants. Journal of Clinical and Experimental Neuropsychology, 39(2), 163–172.
Hammers, D. B., & Duff, K. (2019). Application of Different Standard Error Estimates in Reliable Change Methods. Archives of Clinical Neuropsychology. doi:10.1093/arclin/acz054
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? The Behavioral and Brain Sciences, 33(2–3), 61–135.
Hinton-Bayre, A. D. (2010). Deriving Reliable Change Statistics from test-retest Normative data: Comparison of Models and Mathematical Expressions. Archives of Clinical Neuropsychology, 25(3), 244–256. https://doi.org/10.1093/arclin/acq008
Julayanont, P., & Ruthirago, D. (2016). The illiterate brain and the neuropsychological assessment: From the past knowledge to the future new instruments. Applied Neuropsychology, 25(2), 174–187. doi:10.1080/23279095.2016.1250211
Koepsell, T. D., & Monsell, S. E. (2012). Reversion from mild cognitive impairment to normal or near-normal cognition: Risk factors and prognosis. Neurology, 79(15), 1591–1598. doi:10.1212/wnl.0b013e31826e26b7
Komalasari, R., Chang, H. C., & Traynor, V. (2019). A review of the Rowland Universal Dementia Assessment Scale. Dementia, 18(7–8), 3143–3158.
Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, D. (2012). Neuropsychological Assessment. New York: Oxford.
McSweeny, A. J., Naugle, R. I., Chelune, G. J., & Luders, H. (1993). "T scores for change": An illustration of a regression approach to depicting change in clinical neuropsychology. The Clinical Neuropsychologist, 7, 300–312.
Naveh-Benjamin, M. (2000). Adult age differences in memory performance: Tests of an associative deficit hypothesis. Journal of Experimental Psychology: Learning, Memory & Cognition, 26(5), 1170–1187.
Nielsen, T. R., Segers, K., Vanderaspoilden, V., Bekkhus-Wetterberg, P., Minthon, L., Pissiota, A., Waldemar, G. (2018). Performance of middle-aged and elderly European minority and majority populations on a cross-cultural neuropsychological test battery (CNTB). The Clinical Neuropsychologist. doi:10.1080/13854046.2018.1430256
Nell, V. (2000). Cross-cultural neuropsychological assessment. Theory and practice. New Jersey: Lawrence Erlbaum Associates.
Oberg, G., & Ramírez, M. (2006). Cross-linguistic meta-analysis of phonological fluency: Normal performance across cultures. International Journal of Psychology, 41(5), 342–347.
Rosselli, M., Ardila, A., Salvatierra, J., Marquez, M., Matos, L., & Weekes, V. A. (2002). A Cross-linguistic Comparison of Verbal Fluency Tests. Intern. J. Neuroscience, 112, 759–776.
Shapiro, S. S., & Wilk, M. B. (1965). An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52(3/4), 591–611. https://doi.org/10.2307/2333709
Swadesh, M. (1971). The Origin and Diversification of Language. Chicago: Aldine.
Testa, S. M., Winicki, J. M., Pearlson, G. D., Gordon, B., & Schretlen, D. J. (2009). Accounting for Estimated IQ in Neuropsychological Test performance with regression-based techniques. Int Neuropsychol Soc, 15(6), 1012–22. doi:10.1017/S1355617709990713
UNESCO Institute for Statistics. (2017). Literacy Rates Continue to Rise from One Generation to the Next. Fact Sheet, 45. Retrieved from: http://uis.unesco.org/sites/default/files/documents/fs45-literacy-ratescontinue-rise-generation-to-next-en-2017_0.pdf
United Nations-Organisation for Economic Co-operation and Development. (2013). World Migration in Figures. Retrieved from: https://www.oecd.org/els/mig/World-Migration-in-Figures.pdf
United Nations. (2019). International Migrant stock 2019. Retrieved from: https://www.un.org/en/development/desa/population/migration/data/estimates2/estimates19.asp

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Authors and Affiliations
Alberto Luis Fernández1,2 · Gabriel Jáuregui Arriondo1 · Maximiliano Folmer3 · Marcelo Vaiman1,2 · Gazul Rotela Leite1 · David J. Hardy4,5
Alberto Luis Fernández
neuropsicologia.filo@ucc.edu.ar
1 Department of Neuropsychology, Universidad Católica de Córdoba, Córdoba, Argentina
2 Psychometrics Department, Universidad Nacional de Córdoba, Córdoba, Argentina
3 School of Psychology, University of Padua, Padua, Italy
4 Department of Psychological Science, Loyola Marymount University, Los Angeles, USA
5 Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, USA

Methods
Study 1. Development of materials
Figure 1 describes the whole process of the development of the MUNS. The MUNS includes seven subtests evaluating five cognitive domains: attention (1 subtest), memory (1 visual and 2 verbal subtests), executive functioning (1 subtest), constructional praxis (1 subtest) and language (1 subtest).
The subtests are the following: Arrows, for attention; Personage and Word List, for verbal memory; Visual Memory; Party, for executive functioning; Dots and Lines, for constructional praxis; and Animals, for language (see Table 1 for a detailed description of each test).

The stimuli were selected to be understood by people from most cultures. For instance, for the Word List subtest, the words included were taken from the Swadesh list, which is a lexicostatistical list that represents the Basic Core Vocabulary existing in any language (Swadesh, 1971). Therefore, this subtest is easily translated into probably every language. The Visual Memory test includes pictures of the following elements: hand, leaf, flower and building. These are stimuli easily recognizable by inhabitants of most cultures. Most of the administration materials are included in a stimulus book. Five answer sheets are needed for the Party, Visual Memory (immediate and delayed), and Dots and Lines subtests. Four out of the seven subtests have two versions depending on the educational level of the testee (Word List, Personage, Party and Arrows).

Fig. 1 Flowchart of the different stages in the development of the MUNS

Table 1 MUNS subtests description

Word List subtest (WL): The version for LE comprises 10 words, while the version for HE contains 14 words. All the words included in this list were taken from the Swadesh list, which is a lexicostatistical list that represents the Basic Core Vocabulary existing in any language. They belong to two semantic categories (natural elements such as wind or cloud, and body parts). The examiner reads the list for three learning trials. A delayed free recall trial is given after 20 minutes. A recognition trial follows the free recall. Three scores are obtained: Immediate (learning trials 1–3), Delayed and Recognition (individuals are presented with a list containing the learning trial words and distractors, and must recognize the words from the learning trials list).

Personage subtest (Per): In this verbal memory test, a short paragraph with personal information about a fictitious character is read to the testee, whose task is to remember this information. Words for the information were also taken from the Swadesh list. There is no immediate trial, but there is a delayed free recall trial after 15–20 minutes and a cued recall trial with questions about the information that the testee did not remember spontaneously. Two scores can be obtained: Spontaneous Recall and Cued Recall.

Visual Memory subtest (VM): This subtest consists of a series of 4 pictures of universal elements (flower, leaf, hand, building). The pictures are divided into several sections and some of them are filled. Subjects are shown the picture for 10 seconds and, immediately after the picture is removed, they are presented with the same picture but with all the sections blank. The task of the subjects is to fill in the blanks as shown in the previous picture. There is an immediate and a delayed trial. The number of hits is computed to obtain the score for the immediate and delayed trials.

Arrows subtest (Arr): This is an attention subtest with two parts. In part I, a series of pictures containing arrows pointing in different directions are shown every two seconds. The testee must count the arrows pointing to the right. In part II, the subject has to count arrows pointing up and to the left. The LE subjects are asked to keep a single count (adding both arrows), while HE individuals are asked to keep two separate counts.

Party subtest (P): In this executive functioning test, the subject is given a sheet containing a map of a fictitious downtown. On the map, there are marked spots indicating shops where the following items can be purchased: food, drink, silverware, table, chairs and dessert. There is more than one option for where to buy these items, and the price in number of coins (indicated beside each item) is different for each one. Subjects are asked to buy one item of each category while trying not to exceed a budget of 100 coins. In addition, the subjects have to indicate the route that they will take to buy all these items with a line on the map. They are asked to take the shortest possible route. The scoring system of the Party subtest is based on a combination of scores obtained according to the number of items purchased, the blocks traveled to complete the route, and the number of occasions in which the participant crosses the line over a block instead of using the streets.

Dots and Lines subtest (DL): In this visuoconstructional praxis subtest, subjects are shown four designs comprised of a set of points that are connected with lines. Their task is to copy the figure in a set of points that are adjacent to the design. The number of correctly connected lines is scored.

Animals test (A): In this case, subjects are asked to name as many animals as they can in two minutes. The two-minute period was selected on the basis that animal names differ in length across languages. One point per item is given.

The development of different versions for lower and higher education participants was based on previous research showing that participants with lower education have difficulty solving some of the current neuropsychological tests in use around the world (Julayanont & Ruthirago, 2016; Nell, 2000). This usually produces a floor effect on their performance. The two versions differ in length (verbal memory and executive functioning subtests) or in cognitive load (attention test). Pilot trials were run in order to determine the length and difficulty level for each version. For example, on the Word List subtest, participants with lower education are presented with a 10-word list, while the participants with higher education are asked to remember 14 words.
For the Personage subtest, participants with lower education are presented with a story containing 10 items of information about a fictional character, while the story for participants with higher education contains 15 items. In the Party test, lower education participants are presented with fewer options of items to purchase. While higher education participants are given three options per item, lower education participants are given just two. The educational level of the testee was based on the formal years of schooling completed. Those with fewer than eight years of schooling were included in the lower education group (LE), whereas those with more than seven years of schooling were included in the higher education group (HE).

Sample
The MUNS was administered to 72 Argentinian adults of both sexes (65% female) with an educational range between 1 and 20 years of education (M = 9.1, SD = 4.1), who gave their consent to participate. The MUNS was also administered to a participant who was illiterate. He was not amenable to testing and was therefore not included in the analyzed data. The age range was 15–87 (M = 34.54, SD = 19.3). The sample was comprised of participants recruited from several sources: university students, individuals attending teaching programs, workers from medical institutions, teaching programs for older adults, as well as acquaintances or relatives of test administrators. Before inclusion, each participant's health background was explored through a set of questions. Participants with any of the following diagnoses were excluded from this sample: stroke, loss of consciousness (at least 20 minutes), traumatic head injury, central nervous system disease, chronic renal insufficiency, hepatic encephalopathy, non-treated thyroid disease, epilepsy, non-treated high blood pressure, severe cardiac failure, severe sleep disorders, coma, diagnosed psychiatric disease or illegal drug consumption.
Several participants from a rural area were included in the sample. The participants included in this study were not included in the samples of the subsequent studies.

Procedure
On average, administration of the Spanish version of the MUNS took between 30 and 40 minutes. Administration was performed by several undergraduate psychology students and graduate psychologists. To ensure standardized administration and scoring, all administrators underwent training sessions with the test developers. After reviewing test administration instructions, testers practiced administering the instrument, first with other testers and then with individuals who matched the sample used in the final studies. The same test administration procedure was followed for the data collection of all the studies described in this article. Test administration was carried out in different settings based on availability (classrooms, laboratories, and so forth), under standardized conditions defined as a quiet, well-lit room containing only the administrator and participant.

Data analysis
In order to equate the score ranges between educational groups, all scores were transformed to percentages. For the Animals subtest, a hypothetical maximum of 60 points was set to obtain the percentage. The subtest scores were added to compose the total score.

Study 2. Validity
Sample
Two groups of Argentinian participants, control (C) (n = 51) and clinical (Cl) (n = 39), were compared. All participants gave their informed consent to be tested. Both groups were matched for years of education, reading fluency, and age. Table 2 shows the demographic characteristics of both groups. The recruiting procedure and exclusion criteria were identical to Study 1. Handedness was distributed as follows: 86% right-handed, 12% ambidextrous, and 2% left-handed.
The Cl group was comprised of diverse clinical cases including mild cognitive impairment, neurologic diseases such as epilepsy, stroke, multiple sclerosis, Parkinson's disease, dementia and head trauma, and a subgroup of non-neurologic conditions affecting cognition. This last group included participants who had diverse conditions possibly affecting cognition (e.g., hypertension, diabetes, hypothyroidism, among others), and had at least two neuropsychological tests with z scores below −1.5. The final Cl sample was comprised of mild cognitive impairment (MCI) (n = 9), neurologic disease (n = 13) and non-neurologic conditions affecting cognition (n = 18). Handedness was distributed as follows: 80% right-handed, 18% ambidextrous, and 2% left-handed.

Table 2 Demographic characteristics of the Control and Clinical groups
Demographic variable | C (n = 51) | Cl (n = 39)
Age | 46.7 ± 18.3; range (19–80) | 52.3 ± 19.3; range (19–86)*
Years of schooling | 13.05 ± 3.8; range (6–20) | 11.6 ± 3.3; range (7–20)*
Reading fluency | 135.1 ± 19.9; range (103–186) | 127.9 ± 18.5; range (99–173)*
Male | 21 (41%) | 16 (41%)
Female | 30 (59%) | 23 (59%)
* p > .05

Participants of the Cl group were recruited from a neuropsychology clinical service. They underwent a full neuropsychological assessment that determined if they had a cognitive impairment in any cognitive domain. This evaluation comprised (1) a comprehensive neuropsychological test battery assessing memory, attention, constructional praxis, verbal fluency (phonological and semantic), reading, writing, calculation, and executive functioning; and (2) a detailed review of participants' medical history, including current symptoms, patient medical records, and present life circumstances.
In many cases, additional medical procedures such as neurological assessments, psychiatric assessments, electrophysiological studies (EEG), imaging studies (MRI, CT scan, or SPECT), and laboratory examinations were available for review.

The educational level of the testees was determined according to their reading fluency ability (RF). RF was adopted as an index of the quality of the education level, and it has demonstrated a correlation with MUNS scores that is higher than that of number of years of education (Fernandez & Jauregui, 2021). The RF task consisted of reading a text that described the weather of a city (Córdoba, Argentina). The text, in Spanish, contained 215 words, separated into 5 paragraphs. It was extracted from a free content web page and was modified in order to achieve a neutral emotional tone. The text was presented in 12-point "Times New Roman" font on an A4 size sheet (see Table 3). Participants were asked to place the text at a comfortable reading distance and read it aloud at their usual reading pace. We operationalized RF as the number of words read correctly per minute. Reading performance was audio recorded to accurately score for errors and reading time. Omissions, substitutions, insertions, and self-corrections were deemed as errors. The following formula was used to obtain the score: RF = 60 × (215 − errors) / total time in seconds. Participants who obtained an RF score below 95 were included in the LE group. This cut-off score was obtained through the analysis of the pilot sample, which showed that 95 was the score that best matched RF and lower education (participants with less than 8 years of schooling). Due to the insufficient number of LE participants recruited to date, the analyses presented in this article are based only on the performance of HE participants.

Procedure
Administration settings and procedures were identical to the ones described in Study 1.
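As an informal illustration of the reading fluency (RF) score and the RF-based group assignment described in the Sample section, the calculation can be sketched in Python. This is only a sketch under stated assumptions: the function and constant names are hypothetical and are not part of the MUNS materials, and the example reader (5 errors in 90 seconds) is invented for illustration.

```python
# Sketch of the RF score: words read correctly per minute.
# Names below are hypothetical; only the formula, the 215-word text length
# and the cut-off of 95 come from the article.

TEXT_LENGTH_WORDS = 215  # length of the Spanish reading text
LE_CUTOFF = 95           # RF scores below this value were assigned to the LE group


def reading_fluency(errors: int, total_seconds: float,
                    total_words: int = TEXT_LENGTH_WORDS) -> float:
    """Return 60 * (total_words - errors) / total_seconds."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= errors <= total_words:
        raise ValueError("errors must be between 0 and total_words")
    return 60 * (total_words - errors) / total_seconds


def education_group(rf_score: float) -> str:
    """Assign 'LE' when the RF score is below the cut-off, otherwise 'HE'."""
    return "LE" if rf_score < LE_CUTOFF else "HE"


# Invented example: 5 errors, text read in 90 seconds.
score = reading_fluency(errors=5, total_seconds=90)
print(score, education_group(score))  # 140.0 HE
```

Because errors are subtracted before scaling to one minute, a slower but accurate reader and a faster but error-prone reader can obtain similar scores, which is consistent with RF being defined as words read correctly per minute.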
The scoring system was modified relative to Study 1 (see Table 4). All subsequent statistical analyses were performed using the new scoring system. In Study 1, correct answers were given one point and incorrect answers zero points. After each answer was scored, the performance of the clinical and control groups was compared on each item. This procedure was followed in order to increase the difference in scores between the groups, thus improving the sensitivity of the scale to differentiate them. Items on which the mean scores of the two groups differed significantly were awarded more points for each correct answer: three points in the Word List, Visual Memory, and Dots and Lines subtests, and five points in the Personage subtest. For all subtests, correct answers on items with no significant group difference were awarded one point. For the Arrows and Animals subtests, one point was given for every correct answer. The scoring of the Party subtest is based on a combination of scores obtained according to the number of items purchased, the blocks traveled to complete the route, and the number of occasions on which the participant crosses the line over a block instead of using the streets.

Table 3 Reading Fluency text

Spanish

El clima de la ciudad de Córdoba, como el de la mayor parte de la provincia, es subtropical húmedo, moderado por los vientos fríos originados en la Antártida, que soplan desde el cuadrante sur-oeste.
Hay cuatro estaciones marcadas. Los veranos que van desde finales de noviembre hasta principios de marzo traen días calurosos con frecuentes tormentas eléctricas. Las olas de calor son comunes y traen días con temperaturas muy altas. Sin embargo, los vientos del sur siempre traen alivio con tormentas y un día o dos de clima fresco. Las temperaturas nocturnas pueden descender fácilmente varios grados, pero el calor comienza a incrementarse de inmediato al día siguiente.
En el otoño el clima ya es significativamente más seco y la temperatura disminuye generando condiciones muy agradables.
El clima frio dura desde finales de mayo hasta principios de septiembre. Sin embargo, a veces, fuertes vientos del noroeste que descienden desde las montañas pueden traer algunos días de mucho calor en el medio del invierno.
La primavera es extremadamente variable y ventosa. En esta estación pueden darse largos períodos frescos y secos con noches frías, seguidos por olas de intenso calor y por tormentas fuertes, con granizo y viento. La sequía es común en esta temporada cuando las precipitaciones de verano llegan más tarde de lo esperado.

English

The climate of the city of Córdoba, like that of most of the province, is humid and subtropical. It is moderated by cold winds originating in Antarctica that blow from the south-western region.
There are four marked seasons. Summers run from late November until early March, bringing hot days with frequent thunderstorms. Heat waves are common, bringing days with very high temperatures. However, south winds are sure to bring relief with thunderstorms and a day or two of cool, crisp weather. Nighttime temperatures can easily drop several degrees, but the heat builds up again the next day.
In the fall, the weather is significantly drier and the temperature decreases, creating very pleasant conditions.
The cold weather lasts from late May until early September.
However, spring is extremely variable and windy. There may be long stretches of cool dry weather with cold nights followed by intense heat waves and thunderstorms with hail and high winds. Drought is most common in this season, when the normal summer rainfall arrives later than expected.

Data analysis

A contrasted group method was performed. The performance of both groups was compared using ANOVA. The sensitivity and specificity indexes were also calculated.

Study 3. Reliability

Sample

The sample included 71 healthy Argentinian participants recruited by snowball sampling. The recruiting procedure and exclusion criteria were identical to Study 1. The mean age was 27.35 ± 14.12 years, and the mean years of education was 14.87 ± 2.76. Education level was also estimated through reading fluency (RF), which ranged between 113.16 and 189.09 words read correctly per minute (M = 148.22 ± 17.56). Seventy-six percent of the subjects were female.
There were no statistically significant differences between males and females. Handedness was distributed as follows: 87% right-handed, 10% left-handed, and 3% ambidextrous.

Table 4 Description of the MUNS old and new scoring system

| Subtest | Old Scoring System | New Scoring System |
|---|---|---|
| Personage | 1 point for correct answers; 0 points for incorrect answers | 1 or 5 points for correct answers; 0 points for incorrect answers |
| Word List | 1 point for correct answers; 0 points for incorrect answers | 1 or 3 points for correct answers; 0 points for incorrect answers |
| Visual Memory | 1 point for correct answers; 0 points for incorrect answers | 1 or 3 points for correct answers; 0 points for incorrect answers |
| Arrows | 1 point for correct answers; 0 points for incorrect answers | 1 point for correct answers; 0 points for incorrect answers |
| Animals | 1 point for correct answers; 0 points for incorrect answers | 1 point for correct answers; 0 points for incorrect answers |
| Dots & Lines | 1 point for correct answers; 0 points for incorrect answers | 1 or 3 points for correct answers; 0 points for incorrect answers |
| Party | A combination of scores including: (a) the number of items purchased, (b) the blocks traveled to complete the route, and (c) the amount saved in coins. | A combination of scores including: (a) the number of items purchased, (b) the blocks traveled to complete the route, and (c) the number of occasions in which the participant crosses the line over a block instead of using the streets. |

Procedure

Administration settings and procedures were identical to those described in Study 1. The test-retest interval was 32.90 ± 4.01 days, with a minimum of 28 and a maximum of 49 days. The scoring system was the same as that used in the validity study.

Data analysis

Reliable change was assessed using a multivariate version of the standardized regression-based (SRB) methodology described by McSweeny et al.
(1993) to predict the subjects’ MUNS retest total scores from MUNS pretest total scores, age, years of schooling, and RF. This methodology was chosen because it accounts for both regression towards the mean and practice effects, and because it has been widely used in neuropsychology (Hammers & Duff, 2019; Hinton-Bayre, 2010). A retest z-score beyond ±1.64 (i.e., outside a 90% confidence interval) was considered a significant change.

Study 4. Normative study

Sample

One hundred and eighty-four healthy Argentinian participants were included in the sample. Participants were recruited from a variety of Argentinean cities and states. The recruiting procedure and exclusion criteria were identical to Study 1. Mean age was 32.22 ± 16.16 years, and the age range was 15–80 years. The mean years of education was 13.84 ± 3.22, and the education range was 4–20 years. Sixty-six percent of the sample were female. Regarding handedness, 86% were right-handed, 9% ambidextrous, and 5% left-handed.

Procedure

Administration settings and procedures were identical to those described in Study 1. The scoring system was the same as that used in the validity study.

Data analysis

For the normative data analysis, the regression-based method was followed. The statistical assumptions of the multiple linear regression model (homoscedasticity, normal distribution of the residuals, absence of multicollinearity, and absence of influential cases) were tested. The Kolmogorov-Smirnov normality test was significant for the MUNS Total Scaled Score (p = .03); however, the Shapiro-Wilk test was non-significant (p = .23). The corresponding skewness and kurtosis indexes were −0.1 and −0.11. As regards multicollinearity, the Variance Inflation Factors and condition indexes were computed. The following indexes were obtained for the former: 1.24 and 1; whereas 14.19 and 20.7 were obtained for the latter.
They indicate null or mild multicollinearity problems. The average Cook’s distance was 0.006, which indicates an absence of outliers influencing the model. Finally, homoscedasticity was tested by plotting the standardized residuals against the standardized predicted values. This plot showed that residuals were homoscedastic across the scores.

Following the method described by Testa et al. (2009), raw scores were transformed to scaled scores (mean = 10, SD = 3) based on the cumulative frequency of the normative sample in order to normalize the score distribution. The resulting scores are shown in Table 5. Next, a multiple regression analysis was performed using the scaled scores as the dependent variable and age, reading fluency, and gender as predictors. Gender was coded as female = 1 and male = 2. An additional regression analysis was performed using number of years of schooling instead of RF as an independent variable. This was done to allow the estimation of z scores in situations in which the participant cannot read.

Study 5. Cross-cultural comparison

Sample

Argentinian (n = 55) and U.S.A. (n = 22) samples were administered the MUNS. There were no significant differences between samples in age, years of education, or gender. Argentinians were administered the Spanish version of the MUNS, while North Americans were administered the English version. Participants whose first language was not Spanish (Argentina) or English (U.S.A.) were not included in the samples.
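The raw-to-scaled conversion and the schooling-based regression described in the normative study can be sketched in Python as follows. The band boundaries and scaled scores are copied from Table 5 and the coefficients from Table 9; the function names are hypothetical, raw totals are assumed to be integers, and totals below the lowest tabled band (143) are treated as out of range because the table does not cover them. Turning the residual into a z score also requires the standard deviation of the regression residuals, which is not reported in this section, so it is left as a caller-supplied argument. This is an illustration under those assumptions, not the authors' scoring software.

```python
from bisect import bisect_right

# Lower bound of each raw-score band in Table 5 (ascending) and its scaled score.
# The last band is "> 389", i.e. 390 and above for integer totals (assumption).
BAND_LOWER = [143, 170, 189, 195, 213, 231, 256, 277, 290,
              313, 321, 336, 350, 363, 369, 374, 390]
SCALED = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]


def raw_to_scaled(raw_total: int) -> int:
    """Look up the scaled score for a raw MUNS total using the Table 5 bands."""
    if raw_total < BAND_LOWER[0]:
        raise ValueError("raw total is below the lowest band listed in Table 5")
    return SCALED[bisect_right(BAND_LOWER, raw_total) - 1]


def predicted_scaled(age: float, gender_code: int, years_schooling: float) -> float:
    """Expected scaled score from the Table 9 model (female = 1, male = 2)."""
    return 7.9607 - 0.0732 * age + 0.0331 * gender_code + 0.3140 * years_schooling


def residual_z(observed: float, predicted: float, residual_sd: float) -> float:
    """z score of the residual; residual_sd must come from the full norms."""
    return (observed - predicted) / residual_sd


# Hypothetical examinee: raw total 300, 30-year-old woman, 12 years of schooling.
observed = raw_to_scaled(300)
expected = predicted_scaled(age=30, gender_code=1, years_schooling=12)
print(observed, round(expected, 4))
```

The lookup uses a sorted list of band lower bounds so that each raw total maps to exactly one band without a chain of comparisons.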
Although the MUNS has two versions depending on the education level of the participant (higher or lower), only higher education participants were included in this study. The recruiting procedure and exclusion criteria were identical to Study 1. The samples were age and education matched. An ANOVA showed that there were no significant differences in age (p = .20) or number of years of schooling (p = .06) between the samples.

Table 5 Equivalence for the Transformation of Raw Scores into Scaled Scores

| Raw Score | Cumulative Percentiles | Scaled Score |
|---|---|---|
| > 389 | 100 | 18 |
| 374–389 | 99 | 17 |
| 369–373 | 97–98 | 16 |
| 363–368 | 94–96 | 15 |
| 350–362 | 89–93 | 14 |
| 336–349 | 81–88 | 13 |
| 321–335 | 69–80 | 12 |
| 313–320 | 57–68 | 11 |
| 290–312 | 44–56 | 10 |
| 277–289 | 31–43 | 9 |
| 256–276 | 21–30 | 8 |
| 231–255 | 13–20 | 7 |
| 213–230 | 7–12 | 6 |
| 195–212 | 4–6 | 5 |
| 189–194 | 2–3 | 4 |
| 170–188 | 1 | 3 |
| 143–169 | 0 | 2 |

Procedure

Administration settings and procedures were identical to those described in Study 1. The scoring system was the same as that used in the validity and reliability studies. As with the Spanish version, the English version was administered by several undergraduate psychology students and graduate psychologists, who were native English speakers. To ensure standardized administration and scoring, all administrators underwent training sessions with the test developers. After reviewing test administration instructions, testers practiced administering the instrument, first with other testers and then with individuals who matched the sample used in the final studies.

For the English version, a forward translation procedure was followed. First, the Spanish version was translated into English by a professional translator. Next, the test developers checked this version with another professional translator and made corrections.
Finally, native English-speaking psychologists, along with the test developers, checked the second version and made final adjustments to ensure the appropriate use of terms specific to the neuropsychology field.

Data analysis

An ANOVA was performed. The independent variable was the cultural group (Argentina vs. the U.S.A.) and the dependent variables were the seven subtest scores and the total score of the MUNS.

Results

Study 1. Development of materials

Overall, participants understood the instructions and were able to perform the tasks, even the elderly with very low education. Separate ANOVAs with educational level (higher vs. lower) as the independent variable showed that, for each subtest whose activities differed by education group, the mean scores were not significantly different across groups. Therefore, each version of the test represented the same level of difficulty for its group. Means and standard deviations are shown in Table 6. The Shapiro–Wilk W test for normality of the Total Score was non-significant (p = .06), indicating that the distribution can be considered normal (Shapiro, 1965). However, the same normality test indicated that the Visual Memory-delayed, Arrows, Dots and Lines, and Party subtests were negatively skewed.

Age and education were not significantly correlated. Age had a significant correlation with immediate (-0.32) and delayed Visual Memory (-0.32), the Recognition trial of the Word List (-0.28), the Delayed trial of the Word List (-0.27), and the Total Score (-0.24). Education was significantly correlated with performance on the following subtests: Animals (0.59), Visual Memory immediate (0.49), Visual Memory delayed (0.37), Arrows (0.25), Dots and Lines (0.44), and Total Score (0.52). However, these correlations changed when lower and higher education groups were analyzed separately.
In the lower education group (n = 31), education correlated significantly with Word List immediate (0.39), Word List delayed (0.41), Personage (0.45), Visual Memory immediate (0.44), Arrows (0.52), Dots and Lines (0.82), Party (0.40), and the Total Score (0.78). In the higher education group (n = 41), it correlated with Animals (0.48), Word List immediate (0.32), Visual Memory immediate (0.34), Visual Memory delayed (0.38), Arrows (0.46), and the Total Score (0.54). An ANOVA confirmed that there were no significant differences in performance between males and females, F(1, 70) = 0.11, p = .75.

Study 2. Validity

Significant differences between the groups were found in all subtests except Arrows and the Recognition trial of the Word List. Table 7 shows these differences. In agreement with these results, the MUNS Total Score is the sum of the scores of the following subtests: Word List (trials 1–3), Word List (delayed trial), Personage, Animals, Visual Memory (immediate and delayed trials), and Party. The Dots and Lines score was excluded from the Total Score because of its extremely asymmetric score distribution. The MUNS Total Score should be interpreted as an index that reflects general cognitive functioning excluding attention, recognition, and constructional praxis.

The analysis of the contrasted group study demonstrated a statistically significant difference between the C and Cl groups on the MUNS Total Score. Figure 2 shows this difference. The eta-squared was 0.20, which can be considered a large effect size.

Using the MUNS Total Score, the sensitivity and specificity indexes were obtained by applying the Youden Index (J) to determine the optimal cut-off score. This cut-off score was set at 282. The resulting sensitivity was 88%, whereas the specificity was 51%.
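As a rough illustration of the cut-off selection described above, the following Python sketch computes sensitivity, specificity, and the Youden Index (J = sensitivity + specificity − 1) for each candidate cut-off and keeps the one with the largest J. It is not the authors' code: the function names, the toy scores, and the convention that a total score at or below the cut-off counts as a positive screen are all assumptions made for the example. The likelihood-ratio helper uses the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity.

```python
from typing import Sequence, Tuple


def screen_metrics(control: Sequence[float], clinical: Sequence[float],
                   cutoff: float) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, Youden J) for one cut-off.

    Assumption: a total score at or below the cut-off is a positive screen.
    """
    true_pos = sum(score <= cutoff for score in clinical)
    true_neg = sum(score > cutoff for score in control)
    sensitivity = true_pos / len(clinical)
    specificity = true_neg / len(control)
    return sensitivity, specificity, sensitivity + specificity - 1.0


def best_cutoff(control: Sequence[float], clinical: Sequence[float]) -> float:
    """Pick the observed score that maximizes J (lowest score wins ties)."""
    candidates = sorted(set(control) | set(clinical))
    return max(candidates, key=lambda c: screen_metrics(control, clinical, c)[2])


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-); undefined when specificity is exactly 0 or 1."""
    return sensitivity / (1.0 - specificity), (1.0 - sensitivity) / specificity


# Hypothetical toy scores, not MUNS data.
toy_control = [300, 310, 290, 320]
toy_clinical = [200, 240, 260, 305]
cutoff = best_cutoff(toy_control, toy_clinical)
sens, spec, j = screen_metrics(toy_control, toy_clinical, cutoff)
print(cutoff, sens, spec, j)
```

Feeding the rounded percentages reported above (0.88 and 0.51) into `likelihood_ratios` gives values close to, but not necessarily identical to, the published ratios, which were presumably computed from unrounded counts.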
With this cut-off score, the positive likelihood ratio, defined as the ratio between the probability of a positive test result given the presence of the disease and the probability of a positive test result given the absence of the disease, was 1.78. The negative likelihood ratio, defined as the ratio between the probability of a negative test result given the presence of the disease and the probability of a negative test result given the absence of the disease, was 0.25 (see Fig. 3).

Table 6 Mean and standard deviations for the entire group on each subtest

| Subtest | Mean ± SD |
|---|---|
| Word List (Trials 1–3) | 58.8 ± 12 |
| Word List Delayed Trial | 62.3 ± 19.7 |
| Visual Memory Immediate | 80.2 ± 9.8 |
| Visual Memory Delayed | 76.7 ± 11.5 |
| Animals | 43.9 ± 14 |
| Arrows | 78.5 ± 14.8 |
| Party | 79.3 ± 7.9 |
| Dots and Lines | 96.3 ± 13.8 |
| Personage | 52.7 ± 18 |
| Total Score | 628.7 ± 71.1 |

Fig. 2 Mean Scores on the Multicultural Neuropsychological Scale Total Score by Group

Table 7 Means and Standard Deviations for all MUNS Subtests and the MUNS Total Score

| Subtest | C (n = 51) | Cl (n = 39) | p |
|---|---|---|---|
| Word List (Trials 1–3) | 51 ± 12.93 | 42.97 ± 12.61 | 0.00* |
| Word List Delayed Trial | 8.51 ± 2.85 | 6.61 ± 3.11 | 0.00* |
| Word List Recognition | 7.76 ± 5.77 | 8.52 ± 3.04 | 0.49 |
| Personage | 22.90 ± 9.89 | 17.15 ± 10.21 | 0.00* |
| Animals | 30.53 ± 8.08 | 23.92 ± 6.26 | 0.00* |
| Visual Memory Immediate | 31.94 ± 8.63 | 26.30 ± 9.07 | 0.00* |
| Arrows | 24.70 ± 3.55 | 23.80 ± 3.61 | 0.23 |
| Visual Memory Delayed | 12.74 ± 5.28 | 10.23 ± 5.12 | 0.02* |
| Party | 117.53 ± 31.76 | 93.92 ± 41.93 | 0.00* |
| Dots and Lines | 183.45 ± 1.85 | 179.76 ± 8.61 | 0.00* |
| MUNS Total Score | 275.16 ± 49.75 | 221.13 ± 62.23 | 0.00* |

*p …

Study 3. Reliability

A multivariate regression analysis was performed to predict participants’ raw MUNS retest total scores from raw MUNS pretest total scores, age, years of schooling, and RF. The predictors contributing significantly were MUNS pretest total score (B = .51, p < .001) and age (B = -1.62, p < .001). The model explained 57% of the variance (R²), F(4, 66) = 24.13, p < .001, …

As in the previous analysis, gender did not contribute significantly to the prediction of the dependent variable.

Additional multiple regression analyses were performed for each subtest score. The results of these analyses are included in the Appendix. They are provided to support the interpretation of the scores yielded by each individual subtest.

Study 5. Cross-cultural comparison

The ANOVA showed no significant differences between the samples on any score except the attention subtest. Table 10 shows the means and standard deviations of each group on all of the scores under analysis. The score range for the MUNS total score was 216–340 in the Argentinian sample and 203–330 in the American sample.

Discussion

The data reported in this article show that the MUNS achieves all the projected objectives. First, it uses universal stimuli that are familiar to most cultures in the world.
The use of the Swadesh list for the verbal memory subtests makes it possible to …

Table 9 Multiple Regression Analysis Results for the Normative Data with Number of Years of Schooling as the Independent Variable

| | β | SE | B | SE | t(180) | p |
|---|---|---|---|---|---|---|
| Intercept | | | 7.9607 | 1.0579 | 7.5243 | 0.000 |
| Age | -0.3970 | 0.0631 | -0.0732 | 0.0116 | -6.2875 | 0.000 |
| Gender | 0.0052 | 0.0631 | 0.0331 | 0.3958 | 0.0836 | 0.933 |
| Number of years of schooling | 0.3384 | 0.0631 | 0.3140 | 0.0586 | 5.3585 | 0.000 |

R = .53, R² = 0.28, F(3, 180) = 23.733, p …

Moreover, individuals diagnosed with MCI frequently revert to normal in subsequent assessments (Koepsell & Monsell, 2012). Therefore, the sensitivity and specificity of the MUNS were tested in populations whose cognitive functioning is similar to that of control populations. It is unsurprising to find a significant degree of overlap between the two populations, with low-performing control participants falling under the cut-off score. What seems more valuable is that the scale correctly identifies most cognitively impaired participants even if they have a mild impairment.

Reliability was also satisfactory. Age was found to be a significant predictor of the retest scores, i.e., the higher the age, the lower the gain in retest scores. This is probably related to the fact that learning ability diminishes as age increases (Naveh-Benjamin, 2000). This illustrates an advantage of SRB methods: compared with the traditional test-retest procedure, they capture the influence of demographic variables on the retest scores. One of the disadvantages of SRB methods is their computational complexity. To address this challenge, a user-friendly Excel calculator for deriving SRB change indices for individuals was developed. This spreadsheet allows the practitioner to enter a participant’s age and MUNS total score to automatically calculate their SRB index.
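To make the SRB calculation concrete, here is a minimal Python sketch of the kind of computation such a calculator performs, following the general SRB logic: predict the retest score from a regression equation, then divide the observed-minus-predicted difference by the standard error of the estimate. The slopes for pretest score (0.51) and age (-1.62) are the unstandardized coefficients reported in Study 3, and the ±1.64 threshold is the one stated in the reliability data analysis. The intercept and the standard error of the estimate are not given in this section, so the values below are placeholders chosen only so the example runs, and the class and function names are hypothetical. It should not be read as a reimplementation of the authors' spreadsheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrbModel:
    intercept: float  # placeholder: not reported in this section
    b_pretest: float  # unstandardized slope for the pretest total score
    b_age: float      # unstandardized slope for age in years
    see: float        # standard error of the estimate (placeholder)

    def predicted_retest(self, pretest: float, age: float) -> float:
        """Retest total score expected from the regression equation."""
        return self.intercept + self.b_pretest * pretest + self.b_age * age

    def z_change(self, pretest: float, retest: float, age: float) -> float:
        """SRB index: (observed retest - predicted retest) / SEE."""
        return (retest - self.predicted_retest(pretest, age)) / self.see


def is_reliable_change(z: float, threshold: float = 1.64) -> bool:
    """Flag change beyond +/-1.64, i.e. outside a 90% confidence interval."""
    return abs(z) > threshold


# Intercept and SEE below are made-up values for illustration only.
model = SrbModel(intercept=180.0, b_pretest=0.51, b_age=-1.62, see=30.0)
z = model.z_change(pretest=300, retest=320, age=40)
print(round(z, 2), is_reliable_change(z))
```

Keeping the coefficients in one immutable object makes it straightforward to swap in the real intercept and standard error once they are taken from the published calculator.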
The supplementary calculator is available online at https://drive.google.com/drive/folders/1VorlbnPLENTWa8--SAH5VSKuJC3cmkzX?usp=sharing, and it is free to use.

As expected, normative data were influenced by age and education. The data reported here allow for the assessment of participants with a wide range of ages and education levels. The performance of participants can be assessed by including data on their RF or their number of years of schooling. In the first case, the analysis is based on an index of the quality of their education, whereas in the second case, it is based on a quantitative assessment of their education. The authors of this article recommend RF as the first option. However, in some cases, participants will not be able to read as a result of conditions such as alexia, aphasia, or poor vision. Moreover, the RF reported here was estimated for Spanish speakers, and RF is expected to vary across languages. It is important to highlight that the inclusion of a qualitative index of education level as a variable in the development of normative data is an innovative procedure introduced in this article. To the best of our knowledge, no studies published to date have developed normative data based on an index of the quality of education.

Seventh, although not fully explored yet, some preliminary data indicate that when demographically matched samples from different cultures are compared, their performance does not differ significantly. Data from Study 5 demonstrated that the MUNS scores did not show significant differences except for the Arrows subtest.
The differences in this subtest probably reflect the psychometric problems shown by the Arrows subtest and not culture-related issues. Although these results need further study because of the sample sizes (especially that of the American sample), they suggest that the MUNS could be applied in English-speaking environments. Further studies are needed to confirm its utility in other cultural settings.

Eighth, the MUNS does not require complex materials or instructions. The MUNS package contains the manual, the booklet, which contains most of the stimuli and instructions for the tester, and only five additional sheets. These features make the package portable and inexpensive. Moreover, the materials are available from the authors upon request.

In addition, an effort was made to present examinees with ecologically valid stimuli. Many of the stimuli, such as the Visual Memory pictures and the Party, Animals, and Personage tasks, were designed to resemble daily activities. This characteristic makes the stimuli more familiar to examinees, especially those with lower education levels who may not be familiar with cognitive testing procedures.

One of the limitations of the research reported in this article is the lack of data for the lower education population. One of the goals of the MUNS is to provide clinicians with an instrument suited to the assessment of lower education participants. Future articles should report data on this population.

Another limitation is the lack of discrimination in the attention subtest between the C and Cl groups. The data reported here show that the task was equally difficult for both groups. For this reason, the score of this subtest was excluded from the MUNS Total Score. An improvement of the current subtest or the development of a new subtest should help resolve this shortcoming.

It is imperative for the future direction of the MUNS to test the performance of samples recruited from different cultures.
There is a need to translate the materials into more languages so they may be administered to demographically matched samples in other countries.

Another possible future development is testing reliability in clinical samples. The performance of clinical samples tends to be more variable over time than that of control samples. Therefore, the performance of some specific clinical groups on the MUNS should be tested, allowing for the collection of more accurate data on the variability of their scores.

As demonstrated, the MUNS is a valid and reliable test for the assessment of cognitive impairment. In the present study, we obtained a sizable normative sample of Argentinean participants representing a wide range of ages and education levels as well as both genders. The MUNS appears to be a potentially useful screening test for the assessment of culturally diverse samples.

Appendix

Table A1 Multiple Regression Analysis Results for the Normative Data with Reading Fluency as the Independent Variable – Word List subtest – Sum Trials 1–3

| | β | SE | B | SE | t(181) | p |
|---|---|---|---|---|---|---|
| Intercept | | | 10.7783 | 1.9209 | 5.6108 | 0.000 |
| Age | -0.2892 | 0.0783 | -0.0527 | 0.0142 | -3.6915 | 0.000 |
| Reading Test | 0.0414 | 0.0783 | 0.0060 | 0.0114 | 0.5296 | 0.597 |

R = .31, R² = 0.09, F(2, 181) = 9.581, p … 4.2312 1.8870 2.2422 0.026

Table A5 Multiple Regression Analysis Results for the Normative Data with Reading Fluency as the Independent Variable – Personage subtest

| | β | SE | B | SE | t(181) | p |
|---|---|---|---|---|---|---|
| Age | -0.1653 | 0.0750 | -0.0309 | 0.0140 | -2.2030 | 0.028 |
| Reading Test | 0.3130 | 0.0750 | 0.0469 | 0.0112 | 4.1708 | 0.000 |

R = .41, R² = 0.17, F(2, 181) = 18.540, p