Planned Missing Data Designs

(from a post I made on the Applied Linguistics Research Methods Facebook group page)
 
"What's on your mind, Dan?"
"Well thanks for asking, Facebook. I'm thinking about planned missing data, actually."
"Why would you plan to have missing data?"
"So I can get more people and/or items without boring anyone to death!"
I've just taken the plunge in a research design with planned missing data. The idea here is getting data on more items or people than you could reasonably expect if you required everyone to respond to all the items at every time point - instead, you ask some of the people to respond to some of the items some of the time.
While the most advanced planned missing data designs are found in longitudinal, well-funded developmental studies (e.g., in health, education; see https://quantitudethepodcast.org/listen/ episode 17 for a nice discussion), it's possible to take advantage of planned missing data in simpler designs as well, like one-shot cross-sectional designs.
More concretely: what I'm collecting data on right now is L2 speech dimensions. I have speech samples from over 200 speakers. In this type of research, it is common to get at least 10 people to judge a few dimensions of each speech sample (e.g., comprehensibility, accentedness). Instead of finding a small number of people and asking them to spend well over 2 hours listening to and trying to consistently judge these dimensions (to be honest, the speech samples get a bit repetitive - all on the same task/topic), with a planned missing data design I can ask 40 or so people to spend 20-30 minutes judging around 30 samples each. All it requires is for listeners to have a bit of overlap in the speech samples they judge. Out of this, I will get estimates of the speech dimensions for each sample, and as a bonus, thanks to involving more listeners, I can also investigate potential listener differences in judgments of these dimensions.
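
For anyone curious about the mechanics, here's a minimal R sketch of one simple way to assign listeners to overlapping subsets of samples. The numbers mirror my design (roughly 200 samples, 40 listeners, ~30 samples each), but the blocking scheme below is just an illustration, not the exact procedure I used:

    set.seed(42)
    n_samples    <- 200  # speech samples to be rated
    n_listeners  <- 40   # available listeners
    per_listener <- 30   # samples each listener rates

    # Walk through a shuffled list of samples in consecutive blocks of 30,
    # wrapping around at the end. Every sample gets the same number of
    # ratings (40 * 30 / 200 = 6), and the wrap-around links listeners to
    # one another through shared samples.
    shuffled <- sample(n_samples)
    assignments <- lapply(seq_len(n_listeners), function(i) {
      positions <- ((i - 1) * per_listener + seq_len(per_listener) - 1) %% n_samples + 1
      shuffled[positions]
    })

    # Check: every sample should be rated exactly 6 times.
    table(tabulate(unlist(assignments), nbins = n_samples))
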
Has anyone else tried out planned missing designs and, if so, care to share? Or read any good applied linguistics papers featuring planned missingness? These kinds of designs are sort of common in assessment research, but they may not be planned as research, per se; rather, they arise from using operational test data with sparse rating designs (e.g., TOEFL writing, where each essay is scored by just a couple of raters) or linked/equated test forms (i.e., two test forms with a handful of overlapping items).
"Why aren't you finishing up that manuscript you were working on, Dan?"
"...Facebook, what aren't you tracking? Jeez."

"Horizon your broadens" - Where narrow exam prep goes wrong

"...horizon your broadens..."

While scoring essays for an English exam, I came across the above phrase, and I couldn't help but chuckle. To be clear, I generally do not find it appropriate to laugh at the efforts of L2 speakers, especially those made in the midst of a high-stakes exam, but this attempt at broaden your horizons, with its near-Spoonerism form, caught me off guard. After I regrouped, gave the essay a careful read and a score, and then scored a couple dozen more essays, this multiword-unit lexical error stuck with me - in part, I think, because I see broaden your horizons so often in writing on English exams.

Why does broaden your horizons show up so much? What was going through the mind of the horizon your broadens author? What experiences and practices led him to this unfortunate and unintentionally humorous, infelicitous production?

There's actually a lot to unpack and reflect on here! 

To answer the first question: I would bet quite a bit of money that broaden your horizons comes up quite frequently in exam prep materials and courses. Although it might sound clichéd to many, it is nonetheless idiomatic (indicating potentially sophisticated lexical knowledge and control) and has a kind of broad functional usefulness, at least when it comes to largely opinion-based writing - you can shoehorn broaden your horizons into a list of advantages for whatever you might be arguing for across many topics (Should people travel? Should people participate in local government? Should students pick their own majors? Should teens work part-time jobs? What is the value of learning languages? etc. - you can work it in almost anywhere).

There's a lot of exam prep out there that focuses on essay templates, including sets of generally useful transitions (e.g., on the other hand), and broadly useful and sophisticated-sounding vocabulary that can be used in a wide variety of contexts (e.g., broaden your horizons). While learning these things isn't necessarily a waste of time in the bigger picture (good writers can use common rhetorical patterns and idiomatic vocabulary well in many different situations), a narrow and exclusive focus on just the language needed to squeak past an essay exam cutscore on a good day is what both language testers and teachers should hope to avoid.

Answering the second and third questions is harder. Maybe the writer had been cramming vocabulary in preparation for the big exam and simply got the form of this chunk mixed up? I know from personal experience that it can be very easy to get things flipped around in L2 vocabulary. I'd imagine the time pressure and stress of the exam made it harder for the writer to catch the mistake. They may also have received poor instruction or little useful feedback on practice essays, where I am sure they would have attempted this phrase at least once before.

Washback is hard to get right. I know the testers behind this exam put in effort to promote broad test prep - working on developing broadly useful and robust language skills - over narrow prep. But so much of what leads into the exam room, and of what happens after, is beyond the influence, much less the control, of testers.

A PhD candidate's job search in the field of Second Language Studies

During this past academic year, I went on the job market in the field of second language studies (broadly construed, including applied linguistics, second language acquisition/studies, TESOL, etc.). Now that the dust has settled, I thought it might be worthwhile to break down how my search went. As I understand it, the job market varies from year to year, so my experience may or may not generalize to other former and future PhD students going on the market.

Before going on the market, I knew that my most desired outcome was a position in academia; although I did/do have some interest in working in the language testing industry, those positions are advertised and filled outside of academic hiring timelines. I targeted tenure-track assistant professor positions relevant to second language studies (e.g., applied linguistics, TESOL) in North America. In total, I submitted applications to 15 jobs. I began seriously looking at job postings in September 2018 and submitted my first application early in October 2018. My last application was submitted at the end of January 2019. To summarize the search, I've put together the figure below (a Sankey diagram created with http://sankeymatic.com/).

Breakdown of my job search.

I like these Sankey diagrams - in this case, they tell the story of my search quite well. I heard back from 10 of my 15 submitted applications, which was pleasantly surprising. I think this is a relatively high rate, which I attribute to looking for positions where I was a good match - I didn't apply to positions where I didn't see a realistic shot, such as those asking for K-12 public school teaching experience and research expertise. From the 10 phone interviews I did, I received 6 campus invites (i.e., second-round interviews). I was again pleasantly surprised here. I had some experience with academic interviews from my days as a university ESL/EAP instructor, so that helped. I also did some interview practice with my advisor and studied each program diligently.

Once the campus visit invites arrived, things got a little rough. Some of my difficulties were related to being overseas in South Korea: 2 programs would not fund international travel for final-round candidates. It was gut-wrenching, but with a family to take care of and no way of knowing how long my job search could go, I couldn't personally justify spending thousands of dollars on chances at jobs, especially when I had chances at others that wouldn't sting me financially.

Luckily, I was able to complete 3 interviews, all of which were more pleasant and a little bit less stressful than I had imagined, resulting in one offer at a teaching-focused program in a really nice location (I'll have more details as I get closer to starting the job).

This offer necessitated cancelling another campus interview, which I felt bad about (they were very pleasant in the phone interview and willing to accommodate my travel). It also required me to withdraw from a search that started in January for a position at a great program that aligned with my profile.
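
As an aside, SankeyMATIC builds diagrams like the one above from plain-text flows of the form Source [amount] Target. Reconstructing my figure from the counts in this post looks roughly like this (the node labels are simplified, and the two visits that didn't produce the offer are lumped together as "No offer"):

    // Flows for the job search Sankey diagram
    Applications [10] Phone interview
    Applications [5] No response
    Phone interview [6] Campus invite
    Phone interview [4] No invite
    Campus invite [3] Campus visit
    Campus invite [2] Declined (travel costs)
    Campus invite [1] Cancelled (offer in hand)
    Campus visit [1] Offer accepted
    Campus visit [2] No offer
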

All in all, I certainly can't complain about my experience: 15 applications is probably around or a little below average these days, and at the end of it I ended up with a job at a quality institution with the possibility of staying around for a long time (assuming I can earn tenure!). But things definitely felt competitive. Despite doing fairly well for myself in grad school, I knew I was competing with lots of other people who did well in grad school, and with some people who were a year or more out of grad school with commensurate teaching experience and publications. I know that 2 of the positions I applied for have been filled by (excellent) people coming off of visiting assistant professorships, and I wouldn't be surprised if a few more end up that way.


Research Tools: Automatic Interview Transcription

Hi all, I'm writing this post to share some resources for automatic transcription - very handy for interview data (but probably not for conversation analysis). I think this is a topic that has come up in the past, but some of the tools I found recently were new (to me), so I thought I'd share.

Context: I've done about 30 hours of interviews; interviews were conducted in either English (my L1, interviewees' L2+) or Korean (my L2, interviewees' L2). Transcribing the English interviews by hand wasn't terribly slow, but it was tedious. Transcribing the Korean interviews was rough (okay, brutal) - my proficiency and typing speed slowed me down a lot. So I got to searching for additional help.

Where things stand with speech-to-text, you can't expect perfect transcriptions, but you can get fairly decent accuracy when speaker proficiency is high and the speech is not strongly accented. These rough transcripts can form a basis for manual clean-up, which is proving to be a lot quicker for me than starting from scratch. Here are a few of the tools I tried:

temi.com - This site only does English, but the accuracy seems quite good (even for non-native speakers). Lots of bells and whistles- there is a very nice interactive transcription editor. The rates are quite cheap, too: 10 cents per minute.

happyscribe.co - This site handles many different languages and has a lot of bells and whistles - automatic punctuation, automatic splitting of speaker turns, highlighting of parts of the transcription the system was less sure about, and really nice integrated playback (you can click a part of the transcript and the audio plays from there). I found the speaker segmentation to be a little off, and the transcription accuracy seems a bit low for Korean (could just be the speakers I fed it). Rates are between 9 and 12 euros per hour (depending on whether you have a monthly membership).

vocalmatic.com - Handles a fairly wide range of languages, and I found it was decently accurate for Korean. This is pretty no-frills compared to the other options- no automated speaker separation, no punctuation, limited editing tools - but it turns out transcripts with helpful timestamps. I was able to get a lot of mileage out of their free trial (they say you get 30 minutes free, but....). Otherwise, rates are similar to happyscribe.

Other tools to look into include Trint and Transcribe (wreally.transcribe.com). I was focused on tools offering Korean transcription; support for European languages is more common across platforms.

Difficulty of Korean Phoneme Production and Perception for L2 Learners

It's a new year, and a (renewed) resolution not to let this blog fall completely by the wayside... so, time to start sharing little slices of my dissertation research!

Very briefly, my dissertation is a language assessment project related to the pronunciation of L2 Korean phonemes. I developed a diagnostic test that provides information on how well learners can produce and perceive Korean phonemes. I collected test data from 198 adult learners from a range of first language backgrounds and overall proficiency levels.

After converting each phoneme score to percentages and averaging across learners, here's what I found for the accuracy of production (y axis) and perception (x axis) (graphs made in R with ggplot2):

[Figure: mean production accuracy (y axis) vs. perception accuracy (x axis) for each Korean phoneme, labeled in Hangul.]

For those of you who don't read Korean, but do read the International Phonetic Alphabet, here's another version of the chart:

[Figure: the same plot with IPA labels in place of Hangul.]
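
For those curious about the plotting itself, here's a minimal ggplot2 sketch of how a figure like this can be put together. The data frame below uses made-up placeholder values just so the code runs - these are not my actual results:

    library(ggplot2)

    # Placeholder accuracy values (percent correct) - illustrative only.
    phoneme_acc <- data.frame(
      phoneme    = c("p*", "t*", "k*", "s", "l"),
      perception = c(62, 80, 70, 75, 88),
      production = c(45, 50, 48, 68, 84)
    )

    # Plot each phoneme as its own label: perception on x, production on y.
    ggplot(phoneme_acc, aes(x = perception, y = production, label = phoneme)) +
      geom_text() +
      coord_cartesian(xlim = c(0, 100), ylim = c(0, 100)) +
      labs(x = "Perception accuracy (%)", y = "Production accuracy (%)")
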
In many ways, these results aren't so surprising. For example, the tensed consonants of Korean were the most difficult to produce. And in general, there's a moderate-to-strong correlation between production and perception accuracy (of course, this is averaged across learners, so it should be taken with a grain of salt).

But there are some interesting things going on. For one, learners actually did reasonably well with perceiving some of the tensed sounds, like /d*/ (about 80% accurate). There were also some cases in which production accuracy exceeded perception accuracy; this goes against some stronger versions of L2 speech learning theory, but I'd chalk much of it up to differences in task difficulty. Because the production items were scored according to criteria in line with John Levis' (2005) Intelligibility Principle, productions that were confidently identifiable were scored as correct, even if they weren't exactly within native-like ranges in terms of temporal/acoustic qualities. Perception items, however, were simply right or wrong based on a choice, and these items used native-speaker recordings of standard Korean phonological contrasts.

Reference:

Levis, J. (2005). Changing contexts and shifting paradigms in pronunciation teaching. TESOL Quarterly, 39(3), 369-377. https://doi.org/10.2307/3588485

A few reflections on being back in the language classroom again - as a student

This summer, I had the great privilege of spending 200 hours over the course of 10 weeks in an intensive Korean program in Seoul (thanks to Michigan State University Asian Studies Center's Foreign Language and Area Studies summer fellowship program!). Ever since I started learning Korean at the age of ~23, I have craved opportunities to devote (most of) my attention to learning Korean, rather than squeeze in bits of self-study and chats with my wife or friends here and there.

Aside from noticeably improving my knowledge of and ability in Korean, I also find that being a full-time(-ish, more on this later) student is great for reflecting on language teaching and learning. Here are a few reflections that have stuck with me:

  • Obligation is a Good Thing: Maybe one of the biggest things I get out of language classes is the obligation to show up, participate, do homework, etc. Outside of language classes, it is incredibly easy to just not use the language, and this can be true whether you are in an immersion setting or in a foreign language setting. Hell, I'm married to a speaker of my target language, but because she speaks English incredibly well, I always have the opportunity to be lazy - and I take that opportunity much more than I should. This summer I had to schlep 1.5 hours each way across Seoul for my 5-days-a-week classes, during morning rush hour. It was not always a fun commute, but I would do it again in a heartbeat. Having an obligation to learn and use the language is key. In my research on online language learning platforms, this is often the biggest missing element.
  • I am 99.99999% in favor of Target Language Only classrooms: With all due respect to translanguaging/plurilingualism/plurilingual repertoires/etc., in a language-focused classroom there's nothing quite like near-exclusive use of the target language. Having to struggle through your misunderstandings and botched expression pushes you to recall words and structures as you re-work things. Asking for help or clarification in the TL in the classroom prepares you to do the same outside of the classroom. I relished being the only L1 English speaker in my class this summer. There were multiple Vietnamese and Chinese speakers, who all made great efforts to stick to Korean during class, which I really appreciated. Occasionally, though, there would be a side-huddle among some shared L1 speakers... which was a little jarring, honestly. Everyone else just had to sit and wait! Luckily my classmates were very kind about reporting back to the group in Korean. 
  • Bilingual Dictionaries: Oh, right, that .00001%? I fully support individual use of bilingual (electronic/web) dictionaries, especially as vocabulary gets more advanced and abstract. It's just such a huge time saver, and I found my understanding of Korean words was better after a few seconds looking at definitions and examples in the KOR-ENG Naver dictionary. In some cases, teacher explanations led to confusion for me, likely due to my own deficiencies in lexical knowledge.
  • Meaningful Communication - 말처럼 쉽지 않다 ("Easier said than done"): The program I attended prides itself on its communicative focus, and in many ways, they get it right. For example, we had an assignment where we gave a little presentation on a news article and then presented three topically related questions for whole-class discussion. This was probably my favorite week in class - we spent so much time talking about current events in Korea and elsewhere that we were genuinely interested in, and the grammar and vocabulary we had learned came naturally when we needed it. On the other hand, I think it is still common pedagogical practice to also spend a fair amount of time doing the whole "go around the room and have everyone make a sentence using the target grammar structure." This does involve speaking, but it's not really meaningful or contextualized communication. It also means no one else is talking!
  • Language students can have a lot on their plates: I hinted at this previously, but even though you might be enrolled full-time in an intensive language program, you might end up having a lot of other responsibilities to deal with. In my case, I had some work scoring essays and teaching an online class. I also became a father! Some of my classmates were putting in lots of hours at part-time jobs to support themselves. This kind of stuff makes it hard to do homework/projects as well as you'd like, and can negatively impact your sleep schedule, too. Some students do kind of get to "live the dream" where they have lots of time to study class material and really take advantage of immersion/social opportunities, but not everyone gets to do that. This really made me think about the expectations we have for international students in intensive English programs back home. Many are young and on generous scholarships, but not all of them. On the teaching side of things, I think that means we should really do our best to take advantage of class time, such as creating opportunities to write or work on presentations in class (with teacher and peer support!), and perhaps not assign too much weight to "busywork" like workbook pages that are so easy to just assign as homework.
Well, I could probably go on and on, but I'll stop here. These were some of the bigger thoughts floating around in my head since finishing the class last week. Even though a little negativity slipped into those reflections, it was overall a really positive experience, and I'm already looking forward to my next chance to be a language student.

Freely Accessible L2 Research

Unfortunately, getting access to good L2 research isn't easy. Many, if not most, of the top L2 research journals are behind steep paywalls. Books range from somewhat reasonable (say, $30 for a paperback) to astronomically expensive (hardback encyclopedias costing several hundred dollars). Researchers typically rely on their university libraries to pay for access to these materials, and occasionally pay out of pocket for maybe a few books a year that they'd like to keep on their bookshelves. For practicing teachers, independent researchers, and researchers working in contexts with limited resources, these costs are too much to bear. However, there are a number of good sources of freely accessible L2 research.

This post, a continual work in progress, is my compilation of what I think are really good, free sources of L2 research. If you have any suggestions for additions, I'd love to hear from you (though I ultimately reserve the right to curate for quality!).

Journals
White Papers/Research Reports
Working Papers
Working papers are like journals, but the overall caliber of the work is a bit lower; they often feature initial research ideas or preliminary reports by graduate students and sometimes faculty. Although that should be kept in mind when reading, working papers can still provide interesting ideas, reliable reports of research, and adequate topical syntheses.

Speech Intelligibility and Grain Size - a SLRF 2017 preview post

Next month, I'll be giving a talk called "Explaining Intelligibility: What matters most in L2 Speech?" at the Second Language Research Forum 2017 in Columbus, Ohio. That talk will examine features of L2 Korean speech that caused intelligibility issues, based on data from 30 native Korean-speaking listeners. This post is a preview, where I'll show some of my initial summary data.

In second language speech research, a handful of constructs are widely studied and considered important: intelligibility, comprehensibility, accentedness, and fluency. Arguably, intelligibility is the most important, as you can't really have successful communication without it. When we think about intelligibility, we might think of it holistically, describing a person's general ability or a person's performance in some speaking context. Language tests are a good example of this - the word "intelligibility" pops up in rubrics that are used to assess someone's speaking performance on a test task. Visually, we might picture people having different degrees of intelligibility, like this:

Fig 1. Average proportion of eojeols (words) in a picture description task correctly transcribed by 30 Korean listeners. 
In Fig. 1, we do see some variation - around 80% or so of Speaker B's words (actually 어절, eojeol, a word plus its bound morphemes, the preferred unit of analysis in Korean linguistics) were intelligible to Korean listeners, on average, while Speaker F clocked in at around 50%. But it isn't necessarily the case that Speaker B is always 30 percentage points more intelligible than Speaker F - everybody stumbles sometimes, right? What if we look at each utterance (sentence) that the speakers produced?
Fig 2. Average proportion of eojeols correctly transcribed in each utterance.
We can see here that Speaker F, while generally having trouble with intelligibility, really dropped the ball on his/her first sentence, which was almost completely unintelligible to the listeners in the study. Speaker B is relatively intelligible throughout, but his/her first sentence was a little harder to grasp compared to the following two. Speaker A shows one of the starkest contrasts, with his/her first sentence around 50% and the rest at 80% or so. It's worth noting that the person and sentence levels are about as fine-grained as most L2 intelligibility research using naturalistic or contextualized speech has gone. Some studies focused on single-word intelligibility (i.e., a learner reads single words, or names single objects) do get to the word level, but I am curious about what leads to intelligibility issues in more realistic contexts. After all, these utterances aren't uniformly 50% intelligible - each word is either intelligible or not. So we can dial in and look at things this way:

Fig 3. Proportions of correct transcriptions for each eojeol.
To me, this is where things get really interesting. For one, we can see much more variation - there's more red and orange in this plot compared to the utterance-level depiction in Fig 2. Some words were almost completely unintelligible to listeners. Those who read Korean might notice that many of these words are names! This is interesting, and it was intentional in the task design - a name involving nasal assimilation at the boundary of its two syllables was deliberately chosen. But other words, often involving times and days of the week, were also quite difficult for listeners to understand. What I'm more interested in, though, is the speech features that might cause these words to be unintelligible - is it phoneme substitutions? Deletions? Pauses or repetitions in the utterance? Lexical errors? Grammatical errors? That's my next task: building models to examine the relative impacts of these (and other) features on intelligibility.
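
As a methodological footnote, moving between these grain sizes is just a matter of which variables you aggregate over. Here's a minimal dplyr sketch, assuming a long-format data frame with one row per listener-by-eojeol scoring decision (the toy data below are random placeholders, not my study data):

    library(dplyr)

    # Toy data: one row per listener x speaker x utterance x eojeol,
    # with correct = 1 if the listener's transcription matched the target.
    set.seed(1)
    scores <- expand.grid(listener = 1:3, speaker = c("A", "B"),
                          utterance = 1:2, eojeol = 1:4)
    scores$correct <- rbinom(nrow(scores), 1, 0.7)

    by_speaker <- scores %>%                    # grain size of Fig 1
      group_by(speaker) %>%
      summarise(intelligibility = mean(correct), .groups = "drop")

    by_utterance <- scores %>%                  # grain size of Fig 2
      group_by(speaker, utterance) %>%
      summarise(intelligibility = mean(correct), .groups = "drop")

    by_word <- scores %>%                       # grain size of Fig 3
      group_by(speaker, utterance, eojeol) %>%
      summarise(intelligibility = mean(correct), .groups = "drop")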

Stay tuned!

P.S. - It's also worth pointing out that the 30 listeners were not monolithic in their overall ability to understand and correctly transcribe words. I'll be looking at listener factors in another analysis at a later time, but here's a little preview of that:

Fig 4. Proportion of eojeols correctly transcribed by each listener.

Using Qualtrics and Soundcloud to do speech/listening research online

Intro

The internet is making it easier and easier to conduct L2 research. It's really easy to send out questionnaires to language learners or teachers, for example. It's also a great tool for collecting writing samples, or for having learners read short texts and answer questions. Importantly, the internet makes finding and interacting with participants much easier than doing everything in person. Although you don't get the level of control over experimental conditions that is valuable for some kinds of research, I'd argue that the internet makes it much more feasible to get non-undergraduate participants at low or no cost. Essentially, you trade experimental control for better sampling (than you could get otherwise).

One thing that isn't really easy to do on the internet is having participants respond to or record audio. Getting audio samples from participants over the internet in response to some kind of stimulus is pretty much a nightmare, and one that I would love to see solved (honestly, internet-based test providers seem to be the only ones with a solid handle on this, in terms of out-of-the-box solutions). However, it's becoming a bit easier to get audio samples to participants, and as long as they're responding with clicks or typing, we're in business.

In this post, I'll show you how I've used Qualtrics and Soundcloud to collect transcriptions for a speech intelligibility project I am currently working on. In this project, I had 30 native speakers of Korean (about half located in Korea, and the others spread around the globe) transcribe 28 utterances produced by L2 Korean learners.

Qualtrics

Qualtrics is a well-known online survey platform. There's a good chance your institution has deluxe access that will allow you to use most available functions AND dress up your surveys with a nice, official-looking stylesheet. If not, you can still use Qualtrics for free as an individual; the free version will let you do many kinds of simple surveys.

For this post, I'm going to assume some basic familiarity with Qualtrics. If you're totally new to it, head over to support.qualtrics.com to read up on the basics (and honestly, it's pretty intuitive- just make an account and start playing around to get a feel for it).

Qualtrics does have built-in audio/video support, letting you upload files to your account and embed them in survey questions. Going this route, you can very easily implement something like this:

[Screenshot: a survey question with an embedded audio player above a text-entry box.]

To do this, create a new Text/Graphic question, type your direction (e.g., "Click play to listen.") and then click the Rich Content Editor... tab. Then, click the little film icon to upload and insert a media file.

[Screenshot: the Rich Content Editor, with the film icon for inserting media.]

After you have your audio file loaded in, create a new Text Entry question below - this is where a listener can type what they heard. You can also add any other type of question, or multiple questions. This is a nice, simple solution, but there are some potential problems. For one, the audio player is fully controllable by the participant. This means that a participant could listen multiple times, and that participants may all end up with different numbers of repeated listenings. This is a major trade-off in control! Along the same lines, participants can play/pause at will and scrub back and forth. For some research tasks, this might be fine (say, if you're just farming out audio corpus transcription). But if you want to measure someone's listening comprehension or the intelligibility of a particular speaker/utterance, this doesn't give us enough experimental control to do that confidently.

Another weakness of the Qualtrics media player is that it uses Flash. Flash is a fairly common piece of web software (though waning in popularity and use), but in my experience it's not quite ubiquitous and universal - every time you see a broken media link with a little puzzle piece on a webpage, it's because your version of Flash isn't up to date or is for some reason incompatible with what the page is trying to show you. As a researcher, you don't want to lose potential participants because they can't play your stimuli.

Embedded streaming audio with Soundcloud

One workaround is embedding streaming media from an external site. Soundcloud uses HTML5 to stream audio. HTML5 is a nearly universal standard on the web, and almost all contemporary browsers handle it well. And as I'll show you, you can customize how embedded Soundcloud audio displays on your Qualtrics survey to increase your level of experimental control.

Get yourself a Soundcloud account (you get free storage for about 2 to 3 hours' worth of audio), and upload a file. When you upload, make sure to set your file to Private if you don't want it to be accessible to just anyone and everyone on the internet. After uploading, click the Share button by your file and then click over to the Embed tab.

[Screenshot: the Embed tab of Soundcloud's Share dialog.]

If you want to limit participants' control over audio playback, you'll want to click More Options near the bottom of the Embed tab and check the box for Enable automatic play. This might seem like a bad idea, but we'll build our own means of advancing through the survey so that participants aren't startled by audio unexpectedly playing.

[Screenshot: the More Options panel with Enable automatic play checked.]
Finally, click in the Code and preview box to highlight your embed code. Hit ctrl+c to copy the code. Back to Qualtrics!

[Screenshot: the Code and preview box with the embed code highlighted for copying.]

Over in Qualtrics, we're first going to make a question that lets participants get ready to hear an audio stimulus. Create a new Descriptive Text question and type a direction like "Click the >> button to play the next audio file." Next, insert a Page Break to require the participant to manually advance to the next question.

Now, create a new Text Entry question (or Multiple Choice, or whatever you'd like). Click on the HTML tab and paste (ctrl+v) the embed code from Soundcloud. To get rid of any audio controls, change the iframe width value to 0% and the height value to 0. If you look carefully at the rest of the embed code, you can see that auto_play is set to true (you'll also notice some blurry stuff in the screenshot below - just keeping my private file private!). For the other options, you could go through and set them all to false, but this is ultimately unnecessary - since you've made the Soundcloud embed a 0x0-pixel box, there's nothing for anyone to click on in your survey.

[Screenshot: the Soundcloud embed code pasted into the question's HTML tab, with width and height zeroed out.]
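
Since the screenshot blurs the details anyway, here's roughly what the edited embed code ends up looking like - note the zeroed-out width and height and auto_play=true (the track ID below is a made-up placeholder, and Soundcloud's embed code includes a few more options I've trimmed for readability):

    <!-- Hypothetical track ID; your embed code will contain your own. -->
    <iframe width="0%" height="0" scrolling="no" frameborder="no"
      src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/123456789&auto_play=true">
    </iframe>
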
In final form, we get:
1. A screen that gets the participant ready to listen

2. A screen where audio automatically starts playing, and a question that the participant can answer.


Closing 

I hope this post has been helpful. By inserting an automatically playing, one-time-only streaming audio file into Qualtrics, you can have a wide range of participants respond to audio stimuli while still maintaining control over the number of replays and play/pause functions. With some finesse, you could even add a limited number of chances for participants to replay audio (Qualtrics features some fairly robust logic and sequencing options).

Also, I strongly recommend including a practice item or two that serve as audio hardware checks so that participants can make sure their speakers/headphones are a) on, and b) at a comfortable listening volume. In my experiment, I had them transcribe a speaker saying "I can hear the audio well" as a technical check before going on to a practice item.

While you might not be able to achieve the level of control necessary for every kind of speech or listening research, I think we're getting to the point where a lot can be accomplished with pre-packaged data collection software/platforms... if you're willing to do just a little tweaking.