Data analysis · Statistics

How do you find a PhD candidate who would be interested in analyzing personality profile data?

Erik Wittreich Founder / CEO of Hinted

February 10th, 2015

We've collected a lot of very interesting and intimate data on users, but we don't know how best to analyze it. We think it would be a great project for a PhD candidate to undertake as part of his/her dissertation, but we don't know where to begin a search or how to suggest such a topic. Is there a better way than cold-calling professors in a related field? Thanks!

Shingai Samudzi

February 10th, 2015

The better way is to have a bit more respect for the PhD process, and not treat a PhD student like an intern. The assumption you are making is that academia essentially operates like the business world: offer an incentive and get an academic to perform labor for you. While this may be relatively true for post-docs, professors, or people who have already earned their PhD and are trying to make a name for themselves, it does not apply to the students themselves. Outside of the advisor (and unless you have a very close relationship with the candidate, or they were already involved in some way with the project), you will have very little ability to affect their choice of dissertation topic.

More importantly, these students are not stupid. You are asking them to embark on a 5-, 6-, or 7+-year journey for which they cannot get paid, in order to deliver value for your business. Have you thought about how your business would actually advance the existing research of a department or faculty? How would the PhD translate this dissertation and research into a career (most PhDs end up as professors, at least initially, whose goal is to write papers)? Is your work academically significant or relevant? Or are you simply trying to get free labor from someone with a high level of expertise? Your ask reads like a yes to the last question.

Dan Oblinger Founder at AnalyticsFire

February 13th, 2015


Your phrasing, "... lots of interesting data ... don't know how to analyze ... PhD candidate to look at it," really set off some red flags for me.

Everyone with a huge tranche of data says   "we have all of this VALUABLE data but we don't know how to use it."
Then they imagine a scientist with a magic data wand will zap it, and the billions in value will pop out.

The problem with that story is that the data scientist will know many ways of processing the data, but if they do not intimately know what kinds of outcomes would be really valuable, and to whom, they will not know which way to go.
So no matter who you try to connect with, if you can't articulate the kinds of value propositions that might exist in the data, the collaboration will likely be sterile.

I agree with one of the other posters that if you have the money to hire a data scientist, this is probably your best bet. They will have no agenda other than trying to uncover value in your data, and they will not need to perform novel research (which the PhD student will need to do).

If that is not an option, and you think there is novel research to be done on your data, AND you are in a position to invest many MONTHS in teaching this scientist about your data and what is important to do with it, then you are in a position to go hunting.

I agree that professors are the easiest starting point, and they will be mature enough to be able to quickly think about your ideas for the data.

I was a program manager at DARPA, where it was my job to get professors interested in my agendas. The thing I could offer (besides money) that held the greatest sway was DATA.

DATA is magic for the ML/AI researcher.  Often they will have ideas, but cannot easily test them out, because they don't have the data to do it.  If you offer easy access to easily processed data, AND you offer them the ability to publish at least some aspects of what they uncover with the data, THEN you have something that will turn heads.

Indeed, if you go fishing for interest, I would LEAD with the data. Write up a description of your dataset: what are the rows and columns, and what kinds of outputs do you think can be derived from the data?
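
One way to draft that rows-and-columns write-up is to generate a small data dictionary from a sample of the records. The following is only a minimal sketch using the Python standard library. The sample rows and column names (`user_id`, `age`, `openness`, `signup_source`) are hypothetical placeholders, not the actual dataset discussed in this thread, and the type guessing is deliberately crude.

```python
from collections import Counter

# Hypothetical sample rows; these column names are illustrative only.
SAMPLE_ROWS = [
    {"user_id": "u1", "age": "34", "openness": "0.82", "signup_source": "web"},
    {"user_id": "u2", "age": "", "openness": "0.41", "signup_source": "mobile"},
    {"user_id": "u3", "age": "27", "openness": "0.67", "signup_source": "web"},
]


def infer_type(values):
    """Guess a column type from its non-empty string values."""
    if not values:
        return "unknown"
    for caster, name in ((int, "integer"), (float, "float")):
        try:
            for v in values:
                caster(v)
            return name
        except ValueError:
            continue
    return "text"


def describe_columns(rows):
    """Return one summary dict per column: type, missing, distinct, example."""
    columns = list(rows[0].keys()) if rows else []
    summary = []
    for col in columns:
        # Treat absent keys, None, and empty strings as missing.
        present = [r[col] for r in rows if r.get(col) not in (None, "")]
        most_common = Counter(present).most_common(1)
        summary.append({
            "column": col,
            "type": infer_type(present),
            "missing": len(rows) - len(present),
            "distinct": len(set(present)),
            "example": most_common[0][0] if most_common else None,
        })
    return summary


if __name__ == "__main__":
    for entry in describe_columns(SAMPLE_ROWS):
        print(entry)
```

A table like this, plus a sentence or two on what each column means and which outputs you hope to derive, is roughly the kind of description a professor could skim quickly.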

If you describe data that could be bent to fit the agenda of some professor, THEY WILL TAKE NOTICE.
Once they do, it will be in their interest to collaborate with you.  Ideally you can cover 10% of their salary, and a 1/2 time appointment of one of their grad students.  They will jump at that, if they think it will further their research agenda, and you will have some freedom to aim the grad student's efforts, if they want that data.

And of course you can also try to turn the grad student to the dark side, and convince them to leave their PhD and join you... but don't let the prof know of that nefarious agenda. BTW, cash will likely NOT work; you will need to offer equity and a dream of 'making it'. Still, grad students at strong research institutes often have that hack-80-hours-a-week mentality that you might want in a sweat-equity kind of guy.

best luck,

P.S. I am also an ML guy... for me it would be about noticeable equity, and a problem in NLP, probably focused on relational kinds of knowledge.

Shobhit Verma Ed Tech Test Prep

February 14th, 2015

Good job, @Harpreet. I like your Pinterest-style blog design as well.

Steven Schkolne Computer Scientist on a Mission

February 10th, 2015

Hi. Having a PhD, I know a bit about what it's like to be a candidate. Many, many people in academia are looking for real problems. It is well known, by those in the ivory tower, that they are in a tower.

While it sounds great (free research!) to have a PhD candidate attack your particular problem, realize that they will be steered by the needs of the community they are contributing to, as well as the philosophy they are exploring in their dissertation, so there is a high chance they won't address your particular issue, but rather something related.

I do think the best thing to do is to start with professors. They are more likely to have the maturity (and interest) to manage an interface with industry. Also, without their buy-in (and - in most cases - funding) a student won't be able to work on it anyway. Their grants need to actually be well aligned with your problem.

From what you said -- and it is indeed quite brief -- I have a hunch that what you're looking for is a good data scientist, who you hire, and pay cash to analyze your data for you. There is a slim chance that your data actually requires advances to the field of data science, which is absolutely required for a PhD candidate to want to research it. 99% of data science problems don't require research, but instead a good experienced data scientist.

If you do believe there's a strong case for a research problem, I would indeed start by cold-calling, reaching out on LinkedIn, going through your network, etc.: the same kinds of techniques you would use to find anyone you need in the business world. Since the attitudes and interests of academics are so different from those we see in industry, it would really help if you could recruit an academic you know to help with this, so that you can speak the same language. But, in general, most profs I know are dying for their work to have more relevance in the real world. Paint that path and a good relationship will begin to form.

Krithika Chandrasekar

February 10th, 2015

There are better ways than cold-calling professors. A good candidate for your specific problem will be working in either the ECE or CS department. Each graduate program has a program manager who looks after the entire department's administration. For example, the Electrical and Computer Engineering department at UCSB handles all queries about prospective new projects for PhDs through the graduate manager, Val de Veyra: you send her an email, and she sends out an email to the program mailing list. An interested student will then come up with a proposal in alignment with your requirements. Hope this helps!

Erik Molander Executive in Residence at ITEC at Boston University

February 10th, 2015

Hi Erik,

Here are a couple of questions that will be asked by the dissertation committee that you should be prepared to answer.

Did the users give their consent to have this data collected? If they haven't, then the student will not be allowed to use it.

How was the data collected? They will be looking for any systematic bias in the data set. This may not be all bad if you excluded the irrelevant data.

Has all the personally identifying data been removed? The university could be liable for release of personal data.

Then ask yourself a couple of questions.

How does your firm feel about the data being shared with the dissertation committee and potentially with a wider group for peer review? If it is proprietary, then you do not want to share it.

How does your firm feel about the PhD sharing the results with the world? The data and insights will have to be made public for the student to earn their PhD. If they generate some potentially commercially viable insights or algorithms, these will be available for public scrutiny.

Finally, ask to whom this may be valuable. We have a PhD candidate in marketing who might find it useful if it is clearly related to consumer choice and perception of trust. This is a pretty narrow field of inquiry. If the data is really broad, then PhDs in public health, biostatistics, or epidemiology might find it useful if there is some way to link your data set to health outcomes.

If you could send me a brief description of the data set, I might be able to help you narrow down your search.

Cheers,
Erik Molander

Erik Molander
Executive-In-Residence
Strategy and Innovation Department
School of Management
143 Bay State Road, Room 502
Boston, MA 02215
617-358-5864
molander@bu.edu

Benjamin Grosof CTO, CEO, Co-Founder at Coherent Knowledge Systems

February 10th, 2015

Hi Erik,
I used to be a professor at MIT Sloan and have lots of experience in research.  I recommend the following angles. 
Look on the web at descriptions of local/relevant university departments in comp sci and also B-schools (IT and marketing) and maybe social science (social psychology). Survey their research areas and the faculty involved in them. Also look there for their industrial partner programs. That may give you ideas on topics and collaboration modes, as well as pointers to people to contact.
Also, talk to PhD graduate students who have a few years under their belt.  Their time is typically much less constrained than professors', and they can help give you the lay of the land.  You can find them by showing up to seminar or networking events in person.

Arpit Gupta CTO @ AA Creator|Mentor ML|IoT|Cloud|Analytics|SDDC

February 10th, 2015

Universities regularly have demo days and hack events, so you could participate there with this problem.
You could try crowdsourcing (e.g., Mechanical Turk) or specific data science sites.
Visualization and clustering of your data are a good way to start. R is relatively easy to begin with; if the data is not in the GB+ range, plain old XLS spreadsheets can do a lot of analysis as well.
Sometimes motivated stats, maths, or CS undergrads could also do some exploratory analysis.
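
To give a sense of what a first clustering pass might look like, here is a minimal sketch of plain k-means. It is written in Python with only the standard library rather than the R or spreadsheet route suggested above. The two-dimensional scores are invented for illustration, and the initialization (evenly spaced input points) is a simplification; a real analysis would more likely use an established statistics package.

```python
import math

# Hypothetical trait scores (e.g. extraversion, openness) on a 0-1 scale.
POINTS = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, k, max_iter=100):
    """Plain k-means; centroids start at k evenly spaced input points."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign each point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids


if __name__ == "__main__":
    labels, centroids = kmeans(POINTS, k=2)
    print(labels)
    print(centroids)
```

On real data the interesting part is interpreting the clusters, which is where knowing what outcomes would be valuable matters.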

Steve Götz VP @ Frost Data Capital

February 10th, 2015

Hi Erik,

I have some former colleagues at Trinity College Dublin that specialize in personalization and user modeling.  If you're interested I can put you in touch with them and they can discuss your idea.


Chad Huemme Business Development| Operations| Strategy

February 10th, 2015

Erik, I have a PhD contact in India looking to do contract work. Interested? Chad Huemme