Software Engineering Leadership

How do you properly assess an OpenCV project?

Armand Sepulveda Co-Founder & CEO at Dycap Media Solutions, Inc

August 12th, 2015

How does one estimate the length and difficulty of a software project, specifically with Computer Vision?

My team and I are starting on the facial tracking aspect of our product. We produce a camera mount that enables infrared tracking for third-party cameras. In online education, for example, the teacher wears a lanyard and the camera mount follows his movements. Now we want to explore facial tracking so the subject doesn't have to wear any tracking mechanism. We decided computer vision would be the best approach, but our tech team is just learning the fundamentals and can't give me a time estimate for deadlines.

The CV code would need to track a face and translate that into commands for the motors. Eventually we would also want to include hand gesture recognition. Any approximate idea how long this can take? I want to make sure this is something we are not underestimating.

Eric Saund Research Scientist who builds stuff

August 12th, 2015

In my opinion, your project will be more daunting than you seem to realize now. OpenCV will provide many useful tools but, like carpentry, having a great set of tools does not make you a master carpenter.  To jump into this with a team that has zero computer vision experience is not a good idea.

A project involving machine perception (vision, speech, document recognition, etc.) will always have a difficulty/accuracy tradeoff. It will be relatively easy to get OpenCV to track a face in frontal view, at a distance where the face is no smaller than about 20 pixels across in the image, from a stationary camera. Linking this to a PTZ camera to keep the face centered is simple geometry and trigonometry---in theory---but making it work in practice is more painful than you think going in. Once the teacher turns sideways and starts wandering around, once you need to zoom back to cover a larger field of view, etc., etc., performance will degrade. So all of a sudden you are faced with sorting through dropouts, false positives, searching to re-acquire the target that you think is the teacher, and on and on. Your office or lab setup will mislead you into thinking you are farther along than you really are. It will be difficult to anticipate all the variability you will encounter in real classrooms. In many modern classrooms the teacher does not stand at the front of the room and lecture; they move around among the students, and the students get up and work on the board or in groups. So the face detector could find many faces, any or none of which could be the teacher.
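
To give a sense of what the "simple geometry and trigonometry" part might look like, here is a rough Python sketch. It assumes an ideal pinhole camera with no lens distortion, a known horizontal and vertical field of view, and a face bounding box that some detector has already produced. The function names, the deadband, and the gain are all hypothetical and chosen only for illustration. This is the easy part on paper; it does nothing about dropouts, false positives, or deciding which face is the teacher.

```python
import math


def pan_tilt_error_deg(face_box, frame_size, hfov_deg, vfov_deg):
    """Return (pan, tilt) in degrees from the image center to the face center.

    face_box   -- (x, y, w, h) in pixels, origin at the top-left corner
    frame_size -- (width, height) of the frame in pixels
    hfov_deg, vfov_deg -- camera field of view in degrees (assumed known)

    Assumes an ideal pinhole camera with no lens distortion.
    """
    x, y, w, h = face_box
    width, height = frame_size

    # Focal length in pixels, derived from the field of view.
    fx = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    fy = (height / 2) / math.tan(math.radians(vfov_deg) / 2)

    # Pixel offset of the face center from the image center.
    dx = (x + w / 2) - width / 2
    dy = (y + h / 2) - height / 2

    pan = math.degrees(math.atan2(dx, fx))    # positive: face is to the right
    tilt = -math.degrees(math.atan2(dy, fy))  # positive: face is above center
    return pan, tilt


def motor_command(error_deg, deadband_deg=1.0, gain=0.5):
    """Hypothetical proportional command with a deadband to limit jitter."""
    if abs(error_deg) < deadband_deg:
        return 0.0
    return gain * error_deg


if __name__ == "__main__":
    # A 40x40 face box right of center in a 640x480 frame.
    pan, tilt = pan_tilt_error_deg((400, 220, 40, 40), (640, 480), 60.0, 45.0)
    print(motor_command(pan), motor_command(tilt))
```

The deadband is there because a detector's bounding box jitters from frame to frame even when the subject is standing still; without it the mount would hunt constantly. Tuning that, and everything that happens when the detector returns zero or several faces, is where the real time goes.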

Over the years I have seen about a half dozen projects/companies/apps come and go that involve tracking teachers in classrooms with a pan/tilt/zoom camera.   One of the earlier ones was a PTZ camera by Sony from about 1996 that had this function built in.  They were tracking a color mixture blob, as I recall.    If you don't know the history of other products and technologies in this space that came before you, then you may be repeating a history of unsatisfactory outcomes.


August 12th, 2015

"Now we want to explore the facial tracking possibilities so the subject doesn't have to wear any tracking mechanism. We decided computer vision"

- what's wrong with the lanyard approach?
- why won't subjects wear something?
- why CV? what are the alternatives?

As Eric highlighted, CV is a pursuit littered with the failed efforts of some formidable teams. It is not for the faint of heart or light of wallet. If you do take on the CV approach, I'd love to hear about your experiences.