# How to detect a speaker from mouth facial landmarks using face_recognition

I am trying to detect whether a person in front of a webcam is speaking, using the facial landmarks I get from the face_recognition library. I can successfully get the top-lip and bottom-lip points. I want to calculate the distance between these points and, based on that distance, decide whether the person is speaking or not. Here is what I have done so far:

```python
import math

import cv2
import face_recognition

video_capture = cv2.VideoCapture(0)

while True:
    # Grab a single frame of video
    ret, frame = video_capture.read()
    if not ret:
        break

    # face_recognition expects RGB, OpenCV delivers BGR
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # face_landmarks() returns one dict per detected face
    for face_landmarks in face_recognition.face_landmarks(rgb_frame):
        p1 = face_landmarks['top_lip']     # list of 12 (x, y) points
        p2 = face_landmarks['bottom_lip']  # list of 12 (x, y) points

        # three points near the centre of each lip
        x1, y1 = p1[8]
        x3, y3 = p1[9]
        x4, y4 = p1[10]
        x2, y2 = p2[8]
        x5, y5 = p2[9]
        x6, y6 = p2[10]

        dist = math.sqrt(((x2 + x5 + x6) - (x1 + x3 + x4)) ** 2
                         + ((y2 + y5 + y6) - (y1 + y3 + y4)) ** 2)
        print(dist)

        for point in (p1[8], p1[9], p1[10], p2[8], p2[9], p2[10]):
            cv2.circle(frame, point, 1, (255, 255, 255, 0), 2)

    cv2.namedWindow('Video', cv2.WINDOW_NORMAL)
    cv2.imshow('Video', frame)

    # Hit 'q' on the keyboard to quit!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()
```


But the distance I calculate varies even when the person is not speaking. If anyone has an idea how to detect a speaker using mouth landmarks, please let me know. Thanks.


• Can you explain your distance formula? (It looks pretty weird.)
• What happens before the (dlib) landmark extraction? (We can't see that from the code you show.) It's probably quite noisy.

`math.sqrt((x2-x1)**2 + (y2+y1)**2)`, that's the simple formula.

If at all: `math.sqrt((x2-x1)**2 + (y2-y1)**2)`

But that's not what your code is doing.
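As a quick sanity check, the corrected formula as a small helper function (the name is just illustrative):

```python
import math

def euclidean(p, q):
    # straight-line distance between two (x, y) points
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

print(euclidean((0, 0), (3, 4)))  # 5.0
```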

Then, there will always be some distance between the landmarks. To find out whether someone is moving their mouth, you'll need to build a time series of the distances (and maybe do some primitive frequency analysis on it).
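A minimal sketch of such a time-series check, using a simple standard-deviation threshold instead of a full frequency analysis; the `window` and `threshold` values are guesses you would have to tune for your camera and subject:

```python
from collections import deque
import statistics

def make_speaking_detector(window=15, threshold=2.0):
    # Keep the last `window` mouth-opening distances; if they fluctuate
    # strongly (high standard deviation), the mouth is probably moving.
    # Both parameters are assumptions to tune for your setup.
    history = deque(maxlen=window)

    def update(dist):
        history.append(dist)
        if len(history) < window:
            return False  # not enough samples yet
        return statistics.stdev(history) > threshold

    return update

detector = make_speaking_detector()
for d in [10.0] * 15:                      # steady mouth
    steady = detector(d)
for d in [5.0, 20.0, 6.0, 22.0, 4.0] * 3:  # oscillating mouth
    moving = detector(d)
print(steady, moving)  # False True
```

You would call the returned `update` function once per frame with the current mouth-opening distance.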

And you probably need to add up three distances (three point pairs), which is not what your code does now, for sure.
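Summing three pairwise distances could look like this sketch. The index pairs are an assumption based on face_recognition returning 12 points per lip, with bottom_lip ordered in the opposite direction to top_lip:

```python
import math

def mouth_opening(top_lip, bottom_lip):
    # Sum the gaps between three point pairs near the mouth centre.
    # The index pairs are assumptions: face_recognition returns 12 points
    # per lip contour, and bottom_lip runs in the opposite direction.
    pairs = [(8, 10), (9, 9), (10, 8)]
    return sum(
        math.hypot(bottom_lip[b][0] - top_lip[t][0],
                   bottom_lip[b][1] - top_lip[t][1])
        for t, b in pairs
    )
```

Feed the value this returns for each frame into the time-series check above, rather than thresholding a single frame's distance.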