Lyra Lab (天琴实验室) - QQ Music

Extensive music recognition technology refers to the basic technologies behind music retrieval, music appreciation, and music understanding, including audio fingerprinting, cover song recognition, humming recognition, singing ASR (automatic speech recognition), and singer tone (timbre) recognition. Through sustained work on multi-dimensional understanding, recognition, and retrieval of recordings, lyrics, songs, and other content derived from song audio, Lyra Lab has built a complete matrix of recognition technologies. Its application scenarios cover the entire music life cycle: downstream, music recognition by listening, humming recognition, and local music management; midstream, music library management; and upstream, content creation and copyright monitoring. The technology has been widely deployed in QQ Music, WeSing, Kuwo Music, Xiaomi Music, and Tencent Musicians.
The main achievements of Lyra Lab's extensive music recognition technology include:

The audio fingerprint technology, refined through quick responses to user needs and continuous improvement of technical details, won first place in the audio fingerprint competition of MIREX 2019;
The LyraC-Net technology and the singer tone recognition technology were accepted at Interspeech 2022 and IJCN 2021, and the published papers report state-of-the-art (SOTA) results;
The industry's first cover segment technology for music recognition scenarios;
The innovative singing ASR technology has been applied continuously across a wide range of scenarios;
Singing evaluation technology has been continuously extended and adapted to business scenarios, and the reference-based evaluation method built on metric learning was accepted at ISMIR 2021.

Music Retrieval

Audio Fingerprint

Audio fingerprinting measures how similar songs are at the recording level by extracting basic audio features from them.

Typical application scenarios:
Music recognition, matching local music with pictures, detection of uploaded cover versions, and so on.
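
The recording-level comparison described above can be illustrated with a toy, landmark-style fingerprint. This is only a minimal sketch under stated assumptions, not Lyra Lab's method: the names `dominant_bins`, `fingerprint`, `similarity`, and `tone`, the 64-sample frame size, and the synthetic test signals are all hypothetical and chosen for illustration. Each frame is reduced to its strongest frequency bin, consecutive bins are paired into hashes, and two recordings are compared by the overlap of their hash sets.

```python
import cmath
import math


def dominant_bins(samples, frame=64):
    """Return the strongest DFT bin of each non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        mags = []
        for k in range(1, frame // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                      for n, x in enumerate(chunk))
            mags.append((abs(acc), k))
        peaks.append(max(mags)[1])
    return peaks


def fingerprint(samples, frame=64):
    """Pair consecutive peak bins into a set of (bin_a, bin_b) hashes."""
    peaks = dominant_bins(samples, frame)
    return set(zip(peaks, peaks[1:]))


def similarity(fp_a, fp_b):
    """Jaccard overlap of two fingerprints, in [0, 1]."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def tone(bin_index, length, frame=64):
    """Synthetic sine whose frequency falls exactly on one DFT bin."""
    return [math.sin(2 * math.pi * bin_index * i / frame) for i in range(length)]


# Made-up signals: the same "recording" with a small disturbance, and a different one.
song = tone(5, 128) + tone(9, 128) + tone(12, 128)
noisy = [x + 0.05 * math.sin(i) for i, x in enumerate(song)]
other = tone(7, 128) + tone(3, 128) + tone(15, 128)

print(similarity(fingerprint(song), fingerprint(noisy)))
print(similarity(fingerprint(song), fingerprint(other)))
```

Because the hashes depend only on the dominant spectral content, the lightly disturbed copy keeps the same fingerprint while the unrelated signal shares none of it. A production system would use far more robust features and an indexed lookup.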

Cover Song Recognition

Unlike audio fingerprinting, which compares songs at the recording level, cover song recognition mainly compares songs at the level of the main melody.

Typical application scenarios:
The cover recognition module within music recognition technology, song detection, and so on.
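
One simple way to picture melody-level comparison is to describe each melody by its pitch intervals, which do not change when a cover is sung in a different key, and then compare the interval sequences. The sketch below is an assumption-laden illustration, not the LyraC-Net model: the helper names, the MIDI note numbers, and the choice of edit distance are all hypothetical.

```python
def intervals(pitches):
    """Semitone steps between consecutive notes (unchanged by transposition)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def melody_similarity(p, q):
    """1.0 for identical interval sequences, lower as they diverge."""
    a, b = intervals(p), intervals(q)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# Made-up melodies as MIDI note numbers.
original = [60, 62, 64, 65, 67, 67, 65, 64]
cover = [p + 3 for p in original]            # same tune, three semitones higher
unrelated = [60, 60, 67, 67, 69, 69, 67, 65]

print(melody_similarity(original, cover))
print(melody_similarity(original, unrelated))
```

The transposed cover scores a perfect match even though none of its absolute pitches equal the original's, which is the property that separates melody-level matching from recording-level fingerprints.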

Query by Humming

Query by Humming also compares similarity at the level of the main melody. However, it does not compare complete songs with each other; it compares the user's humming against the songs in the library.

Typical application scenarios:
Humming recognition in QQ Music.
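
Matching a hummed query against library melodies has to tolerate both a different key and uneven tempo. A minimal sketch, assuming the hum and the library entries are already available as pitch sequences (MIDI note numbers), is to try a range of transpositions and score each with dynamic time warping (DTW). The function names, the `max_shift` parameter, and the two-song library below are hypothetical, and this is not presented as the production approach.

```python
def dtw_cost(a, b):
    """Dynamic time warping cost between two pitch sequences."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, 1):
            cur.append(abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]


def query_by_humming(hum, library, max_shift=12):
    """Return (title, cost) of the library melody closest to the hum.

    Every transposition within +/- max_shift semitones is tried, so the
    user may hum in any key; DTW absorbs stretched or shortened notes.
    """
    best_title, best_cost = None, float("inf")
    for title, melody in library.items():
        for shift in range(-max_shift, max_shift + 1):
            cost = dtw_cost([p + shift for p in hum], melody)
            if cost < best_cost:
                best_title, best_cost = title, cost
    return best_title, best_cost


# Made-up library of melodies.
library = {
    "song_a": [60, 62, 64, 65, 67, 65, 64, 62],
    "song_b": [67, 67, 69, 67, 72, 71, 67, 67],
}
# The user hums song_a five semitones lower and holds two notes longer.
hum = [55, 55, 57, 59, 60, 62, 62, 60, 59, 57]

title, cost = query_by_humming(hum, library)
print(title, cost)
```

Trying each transposition explicitly keeps the example easy to follow; real systems typically normalize pitch and index melodies so that the whole library does not need to be scanned.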

Singing ASR

Singing ASR recognizes the lyrics in a song, or the phonemes of those lyrics.

Typical application scenarios:
Generating lyrics, checking lyric similarity between songs, and searching by lyrics once they have been recognized.
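
The "search by lyrics" scenario can be sketched as a small inverted index built over recognized lyrics. This is a hypothetical illustration of the downstream step only; it does not perform any speech recognition, and the transcripts, song identifiers, and function names are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a lyric line and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(transcripts):
    """Map each word to the set of songs whose recognized lyrics contain it."""
    index = defaultdict(set)
    for song_id, text in transcripts.items():
        for word in tokenize(text):
            index[word].add(song_id)
    return index


def search(index, query):
    """Rank songs by how many distinct query words they contain."""
    hits = defaultdict(int)
    for word in set(tokenize(query)):
        for song_id in index.get(word, ()):
            hits[song_id] += 1
    return sorted(hits, key=lambda s: (-hits[s], s))


# Made-up transcripts standing in for singing ASR output.
transcripts = {
    "song_a": "hold me close under the silver moon",
    "song_b": "the river runs under the bridge",
}
index = build_index(transcripts)

print(search(index, "silver moon"))     # ['song_a']
print(search(index, "under the moon"))  # ['song_a', 'song_b']
```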

Music Appreciation

Music Library Content

Evaluation dimensions for the music library include uploaded cover version/original labeling, duplicate segment detection, music quality assessment, and cover/original labeling.

UGC Content

Evaluates the songs uploaded by users. Evaluation dimensions include recognition of low-quality content and uploaded cover versions, singing evaluation, timbre classification, recognition of high-quality works, and so on.

Live Show Content

Evaluates the content of live audio streams. Evaluation dimensions include whether the live show is active, singing evaluation, and quality evaluation.

Music Understanding

Automatic Lyrics-to-Audio Alignment

Generates lyrics for a song and force-aligns them to the audio, producing word-level and line-level timestamps (in formats including, but not limited to, QRC).
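
The relationship between word-level and line-level timestamps can be shown with a small data structure. The sketch below assumes an aligner has already produced per-word start and end times; the `Word` class, the helper names, and the sample timings are hypothetical. For output it uses the widely known LRC line format `[mm:ss.xx]text` rather than QRC, since the details of QRC are not given here.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start_ms: int
    end_ms: int


def line_span(words):
    """Line-level timestamps follow from the first and last aligned word."""
    return words[0].start_ms, words[-1].end_ms


def to_lrc(words):
    """Render one lyric line in LRC style: [mm:ss.xx]text."""
    start_ms, _ = line_span(words)
    minutes, rest = divmod(start_ms, 60_000)
    seconds, ms = divmod(rest, 1000)
    text = " ".join(w.text for w in words)
    return f"[{minutes:02d}:{seconds:02d}.{ms // 10:02d}]{text}"


# Made-up alignment result for one line.
line = [Word("hello", 61_230, 61_600), Word("world", 61_650, 62_100)]

print(line_span(line))  # (61230, 62100)
print(to_lrc(line))     # [01:01.23]hello world
```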

Intelligent Caption Recognition

Produces accurate live captions from talk shows and other speech audio using Automatic Speech Recognition (ASR) engines. The timestamped captions can be scrolled dynamically in mobile apps.

Audio Super-Resolution

Reconstructs the samples missing from a low-resolution signal, such as speech or music, using deep convolutional neural networks.
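
To make "missing samples" concrete, the sketch below shows plain linear-interpolation upsampling, the kind of naive baseline that a learned super-resolution model aims to improve on. It is deliberately not a neural network and not the method described above; the function name and the sample values are hypothetical.

```python
def upsample_linear(samples, factor):
    """Insert factor - 1 linearly interpolated samples between each input pair."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if len(samples) < 2:
        return list(samples)
    out = []
    for a, b in zip(samples, samples[1:]):
        for step in range(factor):
            out.append(a + (b - a) * step / factor)
    out.append(samples[-1])
    return out


low_res = [0.0, 1.0, 0.0]
print(upsample_linear(low_res, 2))  # [0.0, 0.5, 1.0, 0.5, 0.0]
```

Interpolation can only draw straight lines between known samples, so it cannot restore high-frequency detail that was lost; predicting that detail is what the learned model is for.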

MIRLAB Song Analysis

Analyzes the basic attributes of songs, such as genre, chorus position, beat distribution, and BPM, and provides audio embedding, main melody extraction, and sound source separation.
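
As one small example of such an attribute, BPM can be derived from a beat distribution. The sketch below assumes beat timestamps (in seconds) have already been detected by some upstream step; `estimate_bpm` and the sample beat times are hypothetical. It uses the median gap between beats so that a single early or late beat does not skew the estimate.

```python
from statistics import median


def estimate_bpm(beat_times):
    """Estimate tempo from beat timestamps (seconds) via the median beat gap."""
    gaps = [b - a for a, b in zip(beat_times, beat_times[1:])]
    if not gaps:
        raise ValueError("need at least two beats")
    return 60.0 / median(gaps)


# Made-up beat positions with one slightly late beat at 2.1 s.
beats = [0.0, 0.5, 1.0, 1.5, 2.1, 2.5]
print(estimate_bpm(beats))  # 120.0
```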

Contact us

Telephone: (0755)8601 3388 - 863574

Email: lyracobar@tencentmusic.com

Working hours

Working days: 10:30 - 20:30

Lyra-CS Dataset

Lyra-QBH Dataset

Lyra-SA Dataset