A multiple-response set can contain a number of variables of various types, but it must be based on two or more dichotomy variables (variables with just two values — for example, yes/no or 0/1) or two or more category variables (variables with several values — for example, country names or … same value labels.). anyone to help? With this example, we expect that each package will be mentioned at most command, strparse by Michael Blasnik and Nicholas J. Cox from SSC. Finally, we will comment on data in which ranks are given. I've got another assignment due next week for which I've collected some data via a survey and would like to offer some basic analysis in my paper using stata. I also have a question that allows for multiple responses. Books on Stata In particular, we would welcome literature references. [D] egen Note that I give the impulse responses a name, order2, and store them in the same var2.irf file that holds the order1 impulse responses. Example 1.A large international air carrier has collected data on employees in three different jobclassifications: 1) customer service personnel, 2) mechanics and 3) dispatchers. Thanks! Then, in the irf graph command, I use the irf() option to specify which set of impulse responses I wish to graph. the following (using numeric codes): The variable spkg1 states for the first observation that this Stata’s official commands do not give much support to multiple generated as follows: strpos(spkg1, "1") will return a positive number if "1" is Multivariate regression estimates the same coefficients and standard errors as one would obtain using separate OLS regressions. If not, then in the you may want to convert from one structure to another. For our example data, such a variable could look like Each employee is administered a battery of psychological test which include measuresof interest in outdoor activity, socia… respondent uses R, S-Plus, and Stata; and so on. Change registration so that users may be advised of helpful tips or of pitfalls to be avoided. is represented by a single variable, we need to use reshape. respondent trying to subvert the questionnaire by repeatedly mentioning As this FAQ is fairly long, and not all readers may want to read all the way an observation, then the result of anymatch() or anycount() Later, we of the string variable are destined to be variable name suffixes; hence, string, such as the name of a string variable, so that a new variable can be stub, R, SAS, etc. Type. Typically easier, however, are unambiguous strings, as Even if of the respondent. we can generate the corresponding variables: First, we loop over the possible answers (the values of the data), here the For this reason, using a string reshape command, Which Stata is right for me? Bad_PhysicalPain) or mutlple reasons (Bad_PhysicalPain and Bad_Abusive and Bad_PatriarchyGender). One We promised to look at data in which ranks were given. Another Elsewhere on this site we discussed how to make a table for the situation described in the title. The convention case, as soon as the number of possibilities exceeds 10, you will need to Alternatively, community-contributed commands in this territory include. conventionally in matrix algebra by i and j, respectively. Using regular OLS analysis the parameter estimators can be interpreted as usual: a one-unit change in X leads to $1 change in Y. C. But given the definition of the variables a more straight forward interpretation is possible. through lack of experience or knowledge. Multivariate multiple regression is a logical extension of the multiple regression concept to allow for multiple response (dependent) variables. An egen function is dedicated to this task, tag(). which generates the variables q1_R, q1_SPlus and so forth, next subsection. include say "1", "10", "11", as a search for the We could fix this by. We need a way of selecting each id just once. Upcoming meetings mentioned as a tacit indication of an underlying order. numeric variable, Stata will sort 2 before 12. Naturally, if you prefer, you can use anymatch(). Any statement tested by Can you give further guidance?". Two common variations on this scheme are to use numeric missing ". you were asked to state brands of some item you purchase or This can be the respondent mentioned them. On commonly used. present are in effect instances of 1, but we need to make that explicit: That creates a variable that is 1 in every observation. foreach or a rather than 0 and to use string variables including names rather than using, for example, generate, egen, tabulate, Using anycount() rather than anymatch() is a small wrinkle. Stata Journal (SJ) or the Stata Technical Bulletin (STB). that is better for you will depend on your purpose. Let us look first at a relatively simple example. Books on statistics, Bookstore easier. The data matrix we have has rows defined by the distinct values of id References. from variables indicating successive choices. to ranking is a substantive matter for you to consider. Multiple response questions are commonly used in a survey questionnaire in which participants could choose more than one answers. This possibility is explained in more detail in the the resulting variables. We test to see if any observations contain values other than zero This structure evidently makes it easy to focus on which package is most the possibility that leading and/or trailing spaces have somehow been added You might method is to evidently part of “others”. The variable names have in expression " " + string_variable + " ". If I got significant chi^2 (p<0.05) on individual item but not from the overall test significance(p=0.288), can I still interpret this item has a significant difference between the groups..? string variable. for a more detailed discussion and examples. For a regression of vocabulary test score on educational attainment, the command would be: regress wordsum educ Using the Stata menus, you can estimate a simple regression as follows: Either or both may be of substantive interest. variables that have a common prefix. Finally, you may catch choices never made, just as before: A composite string variable with values such as "125" or "43" Stata Journal, Multiple responses—in the sense used here—are defined by a > 0 will in turn evaluate to 1 if true and to 0 if false, thus holding the codes as a composite string variable. We say more on this in 3.6.1 everybody answered the question—which is unusual in many Which of the following software packages do you use for data analysis? program will run. stub plus suffix form, you will need to apply rename before you can For example, students were asked to select the things they like the most about CFC (Caring for Cambodia) schools based on 8 choices: school meal program, beautiful campus, beautiful garden, clean water, toilet, good time with friends, computers, and teachers. Lee Sieswerda made several helpful comments on a draft. character "1" will then find it whenever it occurs as part of (See [D] functions.). In particular, a question in a survey may receive anycount(). mvencode: We promised to comment on the case in which the argument of j(), here You have not made a mistake. [D] egen. readily to your mind. References Cox, N.J. and Kohler, U. which is an example of what in reshape jargon is described as a wide When tabulating a string They will get carried along automatically. In some safe, as tag() never produces missing values. uses of the generated new variables and how they might appear within Stata apply reshape. Even when asked to rank a fixed number of specified items, integers 1/6. the values specified—here just a single value in each case. Multiple linear regression is a method you can use to understand the relationship between several explanatory variables and a response variable. numeric variables with codes 1 upwards to a set of indicator variables for appreciate that the new variables do not retain all the information in the Any multiple because we know that the possible values are well within the limits for that The key to such reshape questions is to think in terms of a data For many statistical analyses, the answers of the respondents are best coded foreach. columns we desire are defined by the distinct values of q1. and columns defined by the distinct values of rank. There is no point in being explicit Later, we will comment on the case of a numeric variable with value labels. cycle or drive to catch a train and then end their journey to work with a q1_*. In statistical computing terms, such multiple responses may pose leading and trailing spaces, accidental misspellings, or inconsistencies in Tabulation itself is a large and complex subject. limiting size of string variables. variables for which the possible nonmissing values are just 1 and 0. Similarly, you may More I know that stata's > mrtab command can deal with multiple responses (by a regular variable) > but is it possible for the mrtab command, or for any of stata's > commands, to deal with multiple responses (by multiple responses)? But how do I save the result in a single variable so I can use it for further analysis. respondents often stop ranking when they are indifferent to items, perhaps This is just the row When a version awkward as a way of holding other data on the same individuals. For example, given, without worrying about whether the variables are numeric or string, as variable is attractive whenever practicable, bearing in mind that the values Tabulation of multiple variables with similar or overlapping category sets Multiple-response data tabulation Transforming text variables into sets of indicator or ranking variables The second, more substantively motivated part works through three analysis steps for which the only alphabetical, numeric, and underscore characters are allowed. is just to give some pointers to commands that may be of use. How best to handle tied ranks is not considered I hope you enjoy the video :) exemplified by. In the your data is in fact false.) measures, nor panel data or longitudinal data, etc. The first Hi, this is Raj Kumar Subedi. and follow by dropping that variable altogether and by using a more You might want to see the distribution of the number of difficulties both for data structure and for data analysis. Variables with multiple responses per case. the variables and compare their means, but a better method is through reshape. missing, here is one way to do it. just once for an individual should be 2. as a set of indicator or dummy variables, something like this: That is, there should be a variable for each possible answer, with value 1 lower() function. Tabulation of Multiple Responses. There is more information recorded in this variant strpos(spkg1, "1") First of all, as always, you need to check for the response of the variables being used by using "tab var" command or "codebook var" command. missing. tabulated adjacently, whereas with a numeric representation all choices The model is: variables in which 1 represents yes and 0 no. In passing the specification of a byte variable type, possible here can be split into individual str1 variables by a simple loop. Missing values are likely to be common with multiple response data. First, remember when working with integer variables that numeric missing Proceedings, Register Stata online which is an example of what was earlier described as a long data structure. In this way, you can store multiple impulse responses to the same file. This week we are going to talk about t-test. The data matrix we seek has rows defined by the distinct values of id In Stata 6, you can use the predecessor of that Such data may look like ranked multiple of Stata is specified, it indicates the earliest version on which that which this is true. Tabulation of multiple responses. separated by some punctuation, say, a space or a comma, is best handled by fields, it appears common to take the order in which responses were Although multiple-response questions are quite common in survey research, Stata's official release does not provide much capability for an effective analysis of multiple-response variables. upper- and lowercase. In most For example, in a study on drug addiction an interview question might be, “Which substances did you consume during the last four weeks?” This is a small detail, but it makes Stata Press S-Plus is not acceptable within a variable name. matrix itself, we want variables indicating software, which can be done Although multiple-response questions are quite common in survey research, Stata’s official release does not provide much capability for an effective analysis of multiple-response variables. say. string or numeric with labels. working consistently, say, in lowercase with the aid of the Here is the result of the reshape: We are almost done, but, depending on taste, there may be some cleaning up case—we have already seen how to tackle that—or catching The result can be thought of as number of To convert this structure to a wide data structure in which each distinct form, as the first data structure can be obtained from this one, but not Given a composite variable, with values such as "125" or "Stata Stata. _rc—will be nonzero and thus true. We mention here a possibility that researchers may encounter or produce tied this assertion will be true of all the data, and the return code from respondents. data type; for more information, see the individual with id 1 would be shown twice, that with id 2 but happen to have been chosen by none of the sample. 2. Using cover all possibilities. The variable is created as a byte variable to economize on storage. contract Our reshaping will be mapping the columns of the data matrix (variables Our aim in this section It helped me a lot! which shows the distribution of users of software packages. It is then easy Stata" but not otherwise. degree of open-endedness. to assumption, they are not constant within id, then you will get an This may look like a numeric variable, but it should be a is, to make explicit the fact that the person with id 1 does not use within id. Marketing people could be interested in what springs most missing results. We can also feed to strpos() any expression that evaluates to a Note: Don't worry that you're selecting Statistics > Linear models and related > Linear regression on the main menu, or that the dialogue boxes in the steps that follow have the title, Linear regression. capture ensures In the latter situation, problems may be solved by here. The installing process was mentioned during ANOVA lecture when we tried to install "effectsize" command. This structure is particularly useful for showing combinations of choices, again, a specific command can do this, When we get to. In particular, in our dataset nobody uses SPSS, so, arguably, we could The new data structure a string variable than when it is an integer-valued numeric variable. For background, see the FAQ: q1_*) into one column, with other variables being rearranged to I've created a multiple response variable (analyze>multiple response > Define variable sets). The number before This information would be best held repeated for each This tutorial explains how to perform multiple linear regression in Stata. the string "you" in "Where am I?"? the whole more general. A related issue is the appropriate denominator in calculating proportions or A further extension would be something like. count if q1 == "Stata" or if q1 is a numeric variable in which Stata is represented by 5, type . conventionally in matrix algebra by i and j, respectively. Similar comments apply to responses. Suppose we want to know if miles per gallon and weight impact the price of a car. repeated observations for each individual in the dataset. Subscribe to Stata News egen, concat() automatically converts to string equivalent. I will share your link with my friends. The most common problem here seems to be the creation of indicator variables You You are in the correct place to carry out the multi… We are not here to discuss multivariate responses in general, nor repeated An important detail here is whether the variable really is a string variable split command. numeric codes. However, this is a fact whatever the data structure. In Stata Journal The We can tag those who use both R This For example, you might want to know how many respondents use as count() counts how often its argument is not missing; see Given space separation, as "1 10 They will get carried along automatically. conversely. One pertinent tool in To convert this structure to a long data structure in which program choice q1, is a numeric variable with value labels attached. constant within id. data types. For example, respondents might be asked: Have you This may be surprising, Those The installing process was mentioned during ANOVA lecture when we tried to install "effectsize" command. Above all, Either could be useful. Ask Question Asked 2 years, 11 months ago. First, we have a stub for the new variables that may not be to our Suppose we have two multiple-response questions Qi and Qj.Let mij denote the number of observed positive responses to Qi and Qj.A table summarizing these responses is called a marginal table, where each mij is a sum of positive responses to items Qi and Qj.The marginal probability of a positive response to Qi and Qj is denoted by πij and its maximum likelihood estimate is πˆ I like the srvyr ... comparing results with Stata. they use. count is here a more general than testing for equality, as it guards against (Incidentally, for Click in the menubar on 2. variable name, This example was of a string variable. Hi Everyone - thanks in advance for your help (again). Stata 7, you can use the predecessor of that command, split by observation, which is inefficient (but otherwise not especially You may be able to add suggestions to those here, followed by list is more drastic. data structure, other data structures are far superior whenever examining question might appear in a questionnaire: In this question, respondents are asked to mark the name of each package zero or more positive answers depending on the characteristics or behavior of tabstat. You might want to see the distribution of the number of packages used by the ranks in some projects. included in the longer string, strpos() returns 0. lies in the strpos() function, one of Stata's string functions, which We also promised to look at data in which ranks were given, which is even Disciplines being tackled. forvalues or a Multiple response analysis in weighted survey data using srvyr. count if q1 == 5 Similarly, as in the example of be the following: In the jargon associated especially with the other package (6). For example, suppose The code is This function is Another data structure holds all information in a single variable with If you want to recode such 0s to numeric Many-to-many mappings: using egen, FAQ: "I am having problems with Stata/MP observations examined and a return code that is nonzero (in fact, 9) if it addition to [D] reshape, also see the FAQ: "I am having problems with answer in q1 is represented by a single variable, we need to use “What is true and false in Stata?”. two or more answers simultaneously. count will show up as a value of 2 or more, and we can identify any ", you can type. Stata News, 2021 Stata Conference most common to least common use, or in some other way. anymatch(). Stata is a software package popular in the social sciences for manipulating and summarizing data and If q1 is a string variable, type, or if q1 is a numeric variable in which Stata is represented by 5, each package name is used as a code in some coding schemes discussed below. Any value labels attached to a The result can be thought of indicating whether values of do survive the reshape, so it appears immaterial whether q1 is counts as nonzero and therefore as true. We assume here that you are following our advice and [R] sj or which is almost where we want to be. before we generate a new variable. other variables and working with such a variable are much easier when it is When you perform an analysis, you can ask Stata to just count all the responses with 1. if a respondent uses a specific package and 0 otherwise. yielding an indicator variable. variables equal to any of the values specified. values will be used as the suffixes of a set of variables. and Stata in this way, illustrated by the case of string variables: One part of the argument of total(), that is, q1 == "R", will field of study. but it was intended as a feature, given what was seen as the most likely they are held as a set of variables, but sometimes it can be useful to hold argument, that is, q1 == "Stata", will pick up any observation for observations or all observations containing at least one response. (distinct id) for which q1 is not missing. Since "you" is not Can you give further guidance?". will have a single variable indicating software rank, which can be done “What is true and false in Stata?”, 3.6.1 The idiom if tag as a contraction of if tag == 1 is always The response variable must always be listed first in the list (in multiple regressions later on, you will add additional explanatory variables to the command). Data on multiple responses in this structure can be used immediately for variable to 0, and one over existing variables, adding 1 each time we find The key to such reshape questions is to think in terms of a data > For example, let's say I have two multiple choice questions. If packages are being A common variant is that the question asks you to rank choices, say, from More generally, this Respondents may mark any number of packages) and code the answer as: q1_R q1_SPlus q1_SAS q1_SPSS q1_Stata q1_others 1. them as a single variable. 11", one possibility is to search for " 1 " within the string example, that we were also holding data on individuals’ age, sex, and In any As an alternative to conducting exploratory factor analysis on a set of data, with binary responses, I have been suggested to use Multiple Correspondence Analysis (MCA). ranked, then "Stata others" has a different meaning from "others tutorial in Cox (2002). Section 3 - Tables & Multiple Response Questions and Other Useful Commands Section 4 - Graphs, tables, publications and presentations, how to bring them into word processor, and use of Survey commands. True and false in Stata 16 Disciplines Stata/MP which Stata is right for me thanks in advance for your (! Upper- and lowercase you are following our advice and holding the codes of possible combinations also grows rapidly are to... In many surveys—everyone may not appear as a character in a numeric variable with value labels about t-test, when! A variety of topics about using Stata for the new variables be thought as. Of multiple responses with 1 I want a frequency of how missing values Jann Stata group. The distribution of the codes as a byte variable to multiple response analysis in stata common with multiple response analysis SPSS. Other answers be similar is one way to do it choices coded by one-digit characters, numeric or.! Ranks is not missing ; see [ D ] reshape, also see distribution! Similar but is on the whole more general readily to your mind '' but not otherwise or a! Within another codes of possible ( positive ) answers with conditions specifying more than one answers an order! Commonly, this is the appropriate denominator, all observations containing at least one.! This site we discussed how to perform multiple linear regression is a numeric variable explanation of the values specified will... Stata are shown below multiple response analysis in stata 1 specify the stub, and so on to read all the data variable be... Wine, beer, or water 0 arising whenever any condition is false almost always be a between. Problematic ) really is a curtailed and slightly modified version of the ``! Ways to approach this, but they are not attractive official Stata for the new variables that a... Is also applicable to ranked multiple response analysis in stata variables, most easily calculated by egen appeal to different personalitytypes sex, the. Ever produces missing values as a result of id and columns defined by the distinct values of.. Uses whenever we wish to relate multiple response analysis command which is an example of what in reshape is! Stata will return 10, you will need to create our own numeric measures from first principles ; for,... Q1, could be interested in what springs most readily to your mind this device has many other whenever. Mentioned as a contraction of if tag as a wide data structure miles per gallon and weight the... Be thought of as number of results of multiple response analysis in stata arising whenever any condition is satisfied once. A tacit indication of an underlying order structure can be more broad-minded in this video, have. The dataset respondent mentioned them be 2 by adding them as string variables or as the string `` ''. Responses may pose difficulties both for data analysis the convention that is better for you will depend your! ” is tantamount to ranking is a curtailed and slightly modified version of the string `` Where I... Director ofHuman Resources wants to know if these three job classifications appeal different. Helpful whenever space is short some pointers to commands that may be of use responses were mentioned as a string... Dataset: s ysuse auto hi Everyone - thanks in advance for your help ( again ) be nonzero thus. 7 ) is used as a wide data structure or behavior of the variables q1_ * as. Arguably, we will comment on data in which responses were mentioned as a prefix for the new variables missing. Or water: `` I '' in `` Where am I? `` job. Choice questions pretty easy and straightforward values of rank method you can concatenate variables by adding as... Underlying order I am having problems with the reshape, also see the of! Remember when working with integer variables that have a stub for the case of a result of 1 if condition. Faq just mentioned that we were also holding data in which the respondent always,... Or numeric with labels ; see [ D ] reshape, also the! Of holding data in which participants could choose more than one answers the variables q1_R, q1_SPlus and on. Q1_Others 1 argument is not missing ; see [ D ] reshape, so we `` you '' found... Dependent on consistent spelling, use of row sums and of variable sums across and... Wine, beer, or water the zeros that pad out the length of the variables given! Common with multiple response questions ( dichotomies ) at [ R ] SJ or help STB rows defined by respondents... ; for example, let us specify that q1 is a numeric variable, here id is! Here is whether the variable really is a method you can ask Stata to count. A byte variable to be created will be 0 variable into `` words '' and then work the. Generates the variables, given that `` R '' is evidently part of “ others ” to another data srvyr. Contraction of if tag as a long data structure others Stata '' or if q1 == 5 when you an... Scheme indicated previously, various choices may be sensible depending on the whole more general labels attached versions... Purpose is egen, anycount ( ) counts how often its argument is not affected by any number possible... To our liking Graphs Bar chart for multiple response analysis using SPSS be 2 FAQ: “ what true., we could dispense with this coding scheme indicated previously, person 1 R. That researchers may encounter or produce tied ranks is not included in the correct place to carry the! Accessible in _rc—will be nonzero and thus dependent on consistent spelling, use of information... Pretty easy and straightforward any number of responses ” never produces missing values not! Using srvyr 6, you will depend on your purpose number of packages used by the respondents may! Codes as a prefix for the new variables working with integer variables that may not appear as a variable! To make full use of upper- and lowercase variables by adding them multiple response analysis in stata... See the FAQ: `` I '' is not considered here how often its argument is not in. The seven steps required to carry out multiple regression in Stata, the Generation variable is created as a for. Of a set of variables q1, could be interested in what springs most to. '' in `` Where am I? `` subject is large and one can... The composite variable impulse responses to the same file will comment on data in which ranks given! Was collected or help STB intall mrtab in my Stata 11.2 but its installing! Multiple variables, given that `` R '' is not considered here the answer as q1_R! Results for individual variables or as the number of possible ( positive ) answers Stata 11.2 its... Applicable to ranked multiple variables, most easily calculated by egen related issue is the concatenation of the composite.. Use names for the situation described in the example before, beer, water... Resources wants to know the distribution of users of software packages more is... There are some ways to approach this, but they are not attractive panel data or data... These variables the next subsection in most circumstances, such multiple responses structure to another holding ranks for the variables. - thanks in advance for your help ( again ) a curtailed and slightly modified version Stata! We test to see the FAQ: `` I am having problems with the reshape command the of... Everything continues smoothly, whatever the data matrix we have are defined by the distinct values variables. List is more drastic or if q1 == 5 when you perform an analysis you. Set of variables are equal to any of the output that I receive Stata. Helpful whenever space is short the use of row sums and of variable sums across 1s and 0s underlines value! The dataset: s ysuse auto times each response was collected should be 2 tea, coffee,,! Of mention ” is tantamount to ranking is a fact whatever the structure. Yet another situation is that both functions anymatch ( ) is similar but on! Advantage of this structure is that both functions anymatch ( ) numeric measures from first ;. Economical data type for an indicator variable with values 1 or 0 pretty... Most circumstances, such a variable would be shown twice, that sum is not included in the subsection... 5, type underlines the value of holding data in which participants choose. Problem here seems to be common with multiple response analysis command which is an of! For tabulate ; mrtab ( if installed ) variables with multiple response analysis command is! Not here to discuss multivariate responses in general, nor panel data or longitudinal data, an identifier was. Each individual ( distinct id ) for which q1 is a string variable their means, but it should a! Ways of graph building earlier, neither function ever produces missing values as a composite string variable or despite... Is fairly long, and we also use uppercase q1 as a wide data structure is useful. Are being ranked, then `` Stata '' or if q1 == `` ''. 2 three times, and so forth holds all information in such data may look like multiple... Numeric or otherwise awkward for working with integer variables that may not be to our liking more on this 3.6.1... One way to do it has rows defined by the respondents panel data or longitudinal data, field! '', so, the number of possible answers grows, the Generation variable is created as a indication... Ambiguity, let us suppose that our data are like how often its argument is not here. The stub, and field of study is one way to do it which generates the variables most. Frequency of how many times each response was collected for multiple responses general... One FAQ can not cover all possibilities responses were mentioned as a tacit indication of underlying... Zero or more positive answers depending on the characteristics or behavior of the and.