How to Build an AI App That Analyzes Emotions from Faces and Expressions: Emotion API

Note: sample code included.

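Overview

This is the first article in a series on building a chatbot with high emotional intelligence.

In this series, we develop a chatbot that reads the user's emotions.

The second article is below.

How to Build an AI App That Analyzes Emotions from Voice: Empath API

You will learn how to detect emotions from facial expressions with Microsoft's Emotion API, and from the user's speech (voice input) with the WebEmpath API.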

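Summary

Stream the video from the user's webcam into a video element.
Detect and track the user's face in the video stream.
Capture a photo of the user's face from the live video.
Send the photo to the Emotion API for emotion recognition.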

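Our goal is not simply to build an intelligent chatbot.

It is to build a chatbot with high emotional intelligence.

That starts with collecting emotion data from the user's face.

This is where Microsoft's Emotion API comes in.

So what is the Emotion API?

What is the Emotion API?

The Emotion API recognizes emotions from images.

Like the Face API, it takes the facial expression of a person in an image as input and returns the set of emotions it finds there.

It detects anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise.

These emotions are known to be communicated universally, across cultures, by particular facial expressions.

You should now have an idea of what the Emotion API does with a face.

Next we need a way to capture images automatically, and for that we use TrackingJS.

It analyzes the video stream according to its configuration and tracks shapes and colors.

Let's walk through the steps.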


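Build Steps

Prerequisites

You are familiar with Angular and have built an SPA with Angular CLI.
You have installed all the dependencies needed to create an Angular project with Angular CLI.
Create a project with ng new project-name.
You need a subscription key to use the Microsoft Emotion API. You can get one here.
Save the subscription key in environment.prod.ts as shown below.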

export const environment = {
  production: true,
  apiKeys: {
    'emotion': 'your-api-key-here'
  }
};

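Once the above is in place, you are ready. Follow the steps below.

1. Stream the video into a video element

This is a simple step.

1-1

Open app.component.html and add a video element to it.

Give it a template variable so that it can be accessed from app.component.ts.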

<video #userVideoStream id="userVideoStream" autoplay></video>

1-2

Open app.component.ts and make sure it has the following import.

import { Component, OnInit, ViewChild } from '@angular/core';

1-3

Define the component class properties.

videoNativeElement;
@ViewChild('userVideoStream') userVideoStream;

1-4

In ngOnInit, initialize videoNativeElement with the nativeElement of userVideoStream.

this.videoNativeElement = <HTMLVideoElement>this.userVideoStream.nativeElement;

1-5

Casting nativeElement to HTMLVideoElement gives you IntelliSense.

1-6

The last step is to call getUserMedia on navigator.mediaDevices, passing it the configuration video: true. The call returns a promise.

The callback passed to the promise's then function receives a stream from the user's webcam (or the front camera on a phone) and sets it as the srcObject of videoNativeElement. The code looks like this.

navigator.mediaDevices.getUserMedia({ video: true }).then(
stream => {
this.videoNativeElement.srcObject = stream;
});

1-7

Run ng serve to try this out.

The browser asks for permission to access the webcam. Once you allow it, the live stream from the webcam is fed into the video element.

At this point, app.component.ts and app.component.html should look like this.

import { Component, OnInit, ViewChild } from '@angular/core';

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html',
  styleUrls: ['./app.component.css']
})
export class AppComponent implements OnInit{
  videoNativeElement; 
  @ViewChild('userVideoStream') userVideoStream;
  ngOnInit() {
    this.videoNativeElement = this.userVideoStream.nativeElement;
    navigator.mediaDevices.getUserMedia({ video: true }).then(stream => {
       this.videoNativeElement.srcObject = stream;
    });
  }
}

Note: this method is not supported in Internet Explorer or in some older browser versions.
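Because of that limitation, it can help to check for the API before calling it. The following is a minimal sketch, not part of the original project: the helper name supportsGetUserMedia and the NavigatorLike type are made up for this illustration, and the type is kept structural so that the check does not depend on a real browser.

```typescript
// Hypothetical helper (not from the original article): feature-detects
// navigator.mediaDevices.getUserMedia before step 1-6 calls it.

// Structural stand-in for the browser's Navigator type.
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: unknown };
}

// Returns true only when getUserMedia exists and is callable.
function supportsGetUserMedia(nav: NavigatorLike | undefined): boolean {
  return typeof nav?.mediaDevices?.getUserMedia === 'function';
}

// Possible use inside ngOnInit (assumption, shown as a comment only):
//   if (!supportsGetUserMedia(navigator)) {
//     console.warn('Webcam capture is not supported in this browser.');
//     return;
//   }
```

In a browser without the API the function simply returns false, so the component can show a message instead of throwing.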

2. Detect a face in the video and turn it into an image

This is where the TrackingJS library comes in. Follow the steps below.

2-1

npm install tracking --save

2-2

Add a canvas element to app.component.html to render the image taken from the video.

Give it a template variable so that it can be accessed from app.component.ts.

<canvas #canvasToRenderUserImage></canvas>

2-3

Define the properties in the component class file.

canvasNativeElement; context;

2-4

@ViewChild('canvasToRenderUserImage') canvasToRenderUserImage;

2-5

In ngOnInit, initialize canvasNativeElement and context as follows.

this.canvasNativeElement = this.canvasToRenderUserImage.nativeElement;
this.context = this.canvasNativeElement.getContext('2d');

2-6

We now have a canvas on which to draw the user's image from the video stream.

Next, to track the user's face, import the following.

import 'tracking/build/tracking';
import 'tracking/build/data/face';

2-7

To keep TypeScript from reporting the tracking object as undefined, declare a variable named tracking below these imports.

declare var tracking: any;

2-8

You can now access ObjectTracker on the tracking object.

Specify what you want to track, in this case faces.

const tracker = new tracking.ObjectTracker('face');

2-9

Set the parameters for the tracker you just initialized.

Once that is done, start the tracker. The code looks like this.

tracker.setInitialScale(4);
tracker.setStepSize(2);
tracker.setEdgesDensity(0.1);
tracking.track('#userVideoStream', tracker);

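2-10

The tracker starts tracking faces in the video stream.

It fires an event named track. Pass a callback to the track event as its handler; the callback receives the event object as its first parameter.

The event object has a data array on it.

The length of this array is greater than 0 when the tracker recognizes a face in the video, and 0 when it does not.

We want to draw the user's image on the canvas once the user's face is in the frame, so we do the following.

tracker.on('track', event => {
  if (event.data.length > 0) {
    // capture user image
  }
});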
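The handler above only checks whether the data array is non-empty. As a rough sketch of what else could be done with it, the helper below picks the largest detection. It assumes that each entry in event.data is a rectangle with x, y, width, and height in pixels; the FaceRect type and the largestFace name are made up for this illustration and are not part of TrackingJS.

```typescript
// Assumed shape of one entry in event.data (x, y, width, height in pixels).
interface FaceRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: returns the detection with the largest area,
// or null when the tracker found no face in the frame.
function largestFace(rects: FaceRect[]): FaceRect | null {
  let best: FaceRect | null = null;
  for (const rect of rects) {
    if (best === null || rect.width * rect.height > best.width * best.height) {
      best = rect;
    }
  }
  return best;
}

// Possible use inside the track handler (assumption, shown as a comment only):
//   const face = largestFace(event.data);
//   if (face !== null) { /* capture user image */ }
```

When several people are in the frame, this would favor the face closest to the camera, which is usually the user.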

2-11

To capture the user's image, call the drawImage method on context and pass it the required arguments.

this.context.drawImage(this.videoNativeElement, 0, 0, this.canvasNativeElement.width,
this.canvasNativeElement.height);
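One thing to keep in mind here: a canvas with no width and height attributes defaults to 300 x 150 pixels, so drawing the frame at the canvas size scales the video to fit those dimensions. If you want to keep the video's aspect ratio, a small helper along the following lines could compute the target size. This is a sketch only; the Size type and the fitWithin name are assumptions made for this example.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: scales `source` so that it fits inside `bounds`
// while preserving the aspect ratio. Results are rounded to whole pixels.
function fitWithin(source: Size, bounds: Size): Size {
  const scale = Math.min(bounds.width / source.width, bounds.height / source.height);
  return {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
}

// Possible use before drawImage (assumption, shown as a comment only):
//   const video = this.videoNativeElement;
//   const size = fitWithin(
//     { width: video.videoWidth, height: video.videoHeight },
//     { width: this.canvasNativeElement.width, height: this.canvasNativeElement.height });
//   this.context.drawImage(video, 0, 0, size.width, size.height);
```

Alternatively, you could set the canvas width and height to the video's own dimensions before drawing, which avoids scaling altogether.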

2-12

Stop the live stream by stopping the video tracks.

Calling getVideoTracks on the srcObject of videoNativeElement returns the tracks.

Each track can then be stopped manually with its stop method, like this.

this.videoNativeElement.srcObject.getVideoTracks().forEach(track => track.stop());

2-13

Finally, extract a JPEG from the canvas.

let userImage = this.canvasNativeElement.toDataURL('image/jpeg', 1);

Once all of this is done, app.component.ts and app.component.html should look like this.

import { Component, OnInit, ViewChild } from '@angular/core';

import 'tracking/build/tracking';
import 'tracking/build/data/face';

declare var tracking: any;

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html',
  styleUrls: ['./app.component.css']
})
export class AppComponent implements OnInit{
  videoNativeElement; canvasNativeElement; context;
  @ViewChild('userVideoStream') userVideoStream;
  @ViewChild('canvasToRenderUserImage') canvasToRenderUserImage;
  ngOnInit() {
    this.videoNativeElement = this.userVideoStream.nativeElement;
    navigator.mediaDevices.getUserMedia({ video: true }).then(stream => {
       this.videoNativeElement.srcObject = stream;
    });

    this.canvasNativeElement = this.canvasToRenderUserImage.nativeElement;
    this.context = this.canvasNativeElement.getContext('2d');

    const tracker = new tracking.ObjectTracker('face');
    tracker.setInitialScale(4);
    tracker.setStepSize(2);
    tracker.setEdgesDensity(0.1);
    tracking.track('#userVideoStream', tracker);

    tracker.on('track', event => {
      if (event.data.length > 0) {
        this.context.drawImage(this.videoNativeElement, 0, 0, this.canvasNativeElement.width, this.canvasNativeElement.height);
        this.videoNativeElement.srcObject.getVideoTracks().forEach(track => track.stop());
        let userImage = this.canvasNativeElement.toDataURL('image/jpeg', 1);
      }
    });
  }
}

That's it. Send this userImage to the Emotion API and you get back emotion data for the image. Let's do that in the next step.
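One caveat about the track handler: if the track event fires more than once while a face is in view, the capture code may run several times. The sketch below shows one way to guard against that. It is an addition for illustration, not part of the original project, and the once helper is a made-up name.

```typescript
// Hypothetical helper: wraps a handler so that only the first call runs it.
// Later calls are silently ignored.
function once<T>(handler: (value: T) => void): (value: T) => void {
  let called = false;
  return (value: T) => {
    if (called) {
      return;
    }
    called = true;
    handler(value);
  };
}

// Possible use in ngOnInit (assumption, shown as a comment only):
//   const captureOnce = once(() => {
//     /* drawImage, stop the video tracks, toDataURL */
//   });
//   tracker.on('track', event => {
//     if (event.data.length > 0) { captureOnce(event); }
//   });
```

With this in place, only one image is captured per page load, which also keeps the number of Emotion API requests down once the next step is added.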

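3. Send the user's image to the Emotion API and receive emotion data

This may seem simple, but there is a catch if you have never sent image data as an octet stream in a POST request.

The Emotion API accepts binary data (a Blob) sent as an octet stream, but not the base64-encoded JPEG data URL that the canvas returns, so the format has to be converted.

To do that, we create a service that can perform the conversion.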
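To make the conversion less abstract, it helps to look at what toDataURL actually returns: a string of the form data:image/jpeg;base64, followed by the encoded bytes. The sketch below splits such a string into its parts. It only illustrates the format; the DataUrlParts type and the parseDataUrl name are assumptions made for this example, and the decoding to binary is left to the service.

```typescript
interface DataUrlParts {
  mimeType: string;   // e.g. "image/jpeg"
  isBase64: boolean;  // true when the header contains ";base64"
  payload: string;    // everything after the first comma
}

// Hypothetical helper: splits a data URL into header fields and payload.
function parseDataUrl(dataUrl: string): DataUrlParts {
  const comma = dataUrl.indexOf(',');
  if (dataUrl.indexOf('data:') !== 0 || comma === -1) {
    throw new Error('Not a data URL');
  }
  // Header looks like "image/jpeg;base64".
  const header = dataUrl.slice('data:'.length, comma);
  const fields = header.split(';');
  return {
    mimeType: fields[0],
    isBase64: fields.indexOf('base64') !== -1,
    payload: dataUrl.slice(comma + 1),
  };
}
```

The payload is still text at this point. Turning it into bytes and wrapping them in a Blob is the part the service takes care of.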

3-1

Create a service named emotion. We keep all services in a folder named services inside the app folder.

ng g s services/emotion/emotion

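3-2

In this emotion service, create a public method to be called from app.component.ts.

Pass it the image created from the canvas.

This method makes the AJAX call to the Emotion API.

It also internally calls a helper method that converts the JPEG data URL to an octet stream.

The method returns the Observable from the call to HttpClient's post method, and looks like this.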

import { Injectable } from '@angular/core';
import { HttpClient, HttpHeaders } from '@angular/common/http';

import { environment } from './../../../environments/environment.prod';

@Injectable()
export class EmotionService {

  apiUrl: string = 'https://westus.api.cognitive.microsoft.com/emotion/v1.0/recognize';

  constructor(private http: HttpClient) {}

  getUserEmotion(userImageBlob) {
    let headers = new HttpHeaders();
    headers = headers.set('Ocp-Apim-Subscription-Key', environment.apiKeys.emotion);
    headers = headers.set('Content-Type', 'application/octet-stream');
    return this.http.post(this.apiUrl, this.makeBlob(userImageBlob), { headers: headers });
  }

  makeBlob(dataURL) {
    var BASE64_MARKER = ';base64,';
    if (dataURL.indexOf(BASE64_MARKER) == -1) {
      var parts = dataURL.split(',');
      var contentType = parts[0].split(':')[1];
      var raw = decodeURIComponent(parts[1]);
      return new Blob([raw], { type: contentType });
    }
    var parts = dataURL.split(BASE64_MARKER);
    var contentType = parts[0].split(':')[1];
    var raw = window.atob(parts[1]);
    var rawLength = raw.length;

    var uInt8Array = new Uint8Array(rawLength);

    for (var i = 0; i < rawLength; ++i) {
      uInt8Array[i] = raw.charCodeAt(i);
    }

    return new Blob([uInt8Array], { type: contentType });
  }

}

3-3

In app.component.ts, subscribe to this Observable. Once you have the response, you can use it for emotion analysis.

this._emotionService.getUserEmotion(userImage).subscribe(emotionData => { console.log(emotionData); });

Once that is done, nothing needs to change in app.component.html.

app.component.ts should look like this.

import { Component, OnInit, ViewChild } from '@angular/core';

import 'tracking/build/tracking';
import 'tracking/build/data/face';

import { EmotionService } from './services/emotion/emotion.service';

declare var tracking: any;

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html',
  styleUrls: ['./app.component.css']
})
export class AppComponent implements OnInit{
  videoNativeElement; canvasNativeElement; context;
  @ViewChild('userVideoStream') userVideoStream;
  @ViewChild('canvasToRenderUserImage') canvasToRenderUserImage;

  constructor(private _emotionService: EmotionService) { }

  ngOnInit() {
    this.videoNativeElement = this.userVideoStream.nativeElement;
    let constraints = { 
      video: {
        width: 1280,
        height: 720
      }
    };
    navigator.mediaDevices.getUserMedia(constraints).then(stream => {
       this.videoNativeElement.srcObject = stream;
    });

    this.canvasNativeElement = this.canvasToRenderUserImage.nativeElement;
    this.context = this.canvasNativeElement.getContext('2d');

    const tracker = new tracking.ObjectTracker('face');
    tracker.setInitialScale(4);
    tracker.setStepSize(2);
    tracker.setEdgesDensity(0.1);
    tracking.track('#userVideoStream', tracker);

    tracker.on('track', event => {
      if (event.data.length > 0) {
        this.context.drawImage(this.videoNativeElement, 0, 0, this.canvasNativeElement.width, this.canvasNativeElement.height);
        this.videoNativeElement.srcObject.getVideoTracks().forEach(track => track.stop());
        let userImage = this.canvasNativeElement.toDataURL('image/jpeg', 1);
        this._emotionService.getUserEmotion(userImage).subscribe(emotionData => { console.log(emotionData); });
      }
    });
  }
}

Here we simply printed the response from the Emotion API to the console.
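As a next step beyond logging, you might want to pick out the strongest emotion for each face. The sketch below assumes the response is an array with one entry per detected face, each carrying a faceRectangle and a scores object with the eight emotions listed earlier. The type names and the dominantEmotion and dominantEmotions helpers are made up for this illustration, so check them against the response you actually see in the console.

```typescript
// Assumed shape of the scores object in each response entry.
interface EmotionScores {
  anger: number;
  contempt: number;
  disgust: number;
  fear: number;
  happiness: number;
  neutral: number;
  sadness: number;
  surprise: number;
}

// Assumed shape of one response entry (one per detected face).
interface FaceEmotion {
  faceRectangle: { left: number; top: number; width: number; height: number };
  scores: EmotionScores;
}

// Hypothetical helper: returns the name of the highest-scoring emotion.
function dominantEmotion(scores: EmotionScores): keyof EmotionScores {
  const names = Object.keys(scores) as (keyof EmotionScores)[];
  return names.reduce((best, name) => (scores[name] > scores[best] ? name : best));
}

// Hypothetical helper: one dominant emotion per face; empty for an empty response.
function dominantEmotions(response: FaceEmotion[]): (keyof EmotionScores)[] {
  return response.map(face => dominantEmotion(face.scores));
}

// Possible use in the subscribe callback (assumption, shown as a comment only):
//   .subscribe(emotionData => console.log(dominantEmotions(emotionData as FaceEmotion[])));
```

A chatbot could then branch on the returned name, for example responding differently to sadness than to happiness.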

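Do you now have a picture of how to use the Emotion API in an application?

If you run into errors, or if something was left out of this article or a step does not work, you can see the sample project here.

Please also see the following article.

How to Build an AI App That Analyzes Emotions from Voice: Empath API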