Papers - a llam Collection

Papers

Updated May 29, 2024

DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation

Paper • 2311.07965 • Published Nov 14, 2023
CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding

Paper • 2311.08673 • Published Nov 15, 2023
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation

Paper • 2311.08670 • Published Nov 15, 2023
Stock Volatility Prediction Based on Transformer Model Using Mixed-Frequency Data

Paper • 2309.16196 • Published Sep 28, 2023
Sparks of Large Audio Models: A Survey and Outlook

Paper • 2308.12792 • Published Aug 24, 2023
Research on the Impact of Executive Shareholding on New Investment in Enterprises Based on Multivariable Linear Regression Model

Paper • 2309.10986 • Published Sep 20, 2023
A Hierarchy-based Analysis Approach for Blended Learning: A Case Study with Chinese Students

Paper • 2309.10218 • Published Sep 19, 2023
An Empirical Study of Attention Networks for Semantic Segmentation

Paper • 2309.10217 • Published Sep 19, 2023
Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval

Paper • 2309.08839 • Published Sep 16, 2023
AOSR-Net: All-in-One Sandstorm Removal Network

Paper • 2309.08838 • Published Sep 16, 2023
FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework

Paper • 2309.08837 • Published Sep 16, 2023
DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks

Paper • 2309.07509 • Published Sep 14, 2023
Machine Unlearning Methodology base on Stochastic Teacher Network

Paper • 2308.14322 • Published Aug 28, 2023
Voice Conversion with Denoising Diffusion Probabilistic GAN Models

Paper • 2308.14319 • Published Aug 28, 2023
Symbolic & Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music

Paper • 2308.14317 • Published Aug 28, 2023
Improving Music Genre Classification from Multi-Modal Properties of Music and Genre Correlations Perspective

Paper • 2303.07667 • Published Mar 14, 2023
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis

Paper • 2306.00648 • Published Jun 1, 2023
SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model

Paper • 2304.11547 • Published Apr 23, 2023
Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy

Paper • 2303.07687 • Published Mar 14, 2023
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis

Paper • 2303.07682 • Published Mar 14, 2023
Improving EEG-based Emotion Recognition by Fusing Time-frequency And Spatial Representations

Paper • 2303.11421 • Published Mar 14, 2023
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition

Paper • 2210.14725 • Published Oct 25, 2022
Improving Imbalanced Text Classification with Dynamic Curriculum Learning

Paper • 2210.14724 • Published Oct 25, 2022
Semi-Supervised Learning Based on Reference Model for Low-resource TTS

Paper • 2210.14723 • Published Oct 25, 2022
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse

Paper • 2210.13811 • Published Oct 25, 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach

Paper • 2210.13805 • Published Oct 25, 2022
Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data

Paper • 2210.13803 • Published Oct 25, 2022
Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar

Paper • 2210.06877 • Published Oct 13, 2022
Boosting Star-GANs for Voice Conversion with Contrastive Discriminator

Paper • 2209.10088 • Published Sep 21, 2022
Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation

Paper • 2206.13689 • Published Jun 28, 2022
SUSing: SU-net for Singing Voice Synthesis

Paper • 2205.11841 • Published May 24, 2022
TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS

Paper • 2205.11824 • Published May 24, 2022
MetaSID: Singer Identification with Domain Adaptation for Metaverse

Paper • 2205.11821 • Published May 24, 2022
Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features

Paper • 2205.11817 • Published May 24, 2022
MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification

Paper • 2004.04371 • Published Apr 9, 2020
Investigation of Singing Voice Separation for Singing Voice Detection in Polyphonic Music

Paper • 2004.04040 • Published Apr 8, 2020
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning

Paper • 2202.10976 • Published Feb 22, 2022
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-shot Multi-speaker Text-to-Speech

Paper • 2202.10712 • Published Feb 22, 2022
AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning

Paper • 2202.10020 • Published Feb 21, 2022
Singer Identification Using Deep Timbre Feature Learning with KNN-Net

Paper • 2102.10236 • Published Feb 20, 2021
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training

Paper • 2208.04035 • Published Aug 8, 2022
PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion

Paper • 2308.11084 • Published Aug 21, 2023
Medical Speech Symptoms Classification via Disentangled Representation

Paper • 2403.05000 • Published Mar 8, 2024