Latest in cs.mm

Towards QoS-Aware Recommendations (Jul 15 2019). This paper suggests the concept of QoS-aware recommendations for multimedia services/applications. We propose that recommendation systems (RSs) should take into account the expected QoS with which a content can be delivered to a user, to increase the ...
Gesture-to-Gesture Translation in the Wild via Category-Independent Conditional Maps (Jul 12 2019). Recent works have shown Generative Adversarial Networks (GANs) to be particularly effective in image-to-image translations. However, in tasks such as body pose and hand gesture translation, existing methods usually require precise annotations, e.g. key-points ...
Beyond Imitation: Generative and Variational Choreography via Machine Learning (Jul 11 2019). Our team of dance artists, physicists, and machine learning researchers has collectively developed several original, configurable machine-learning tools to generate novel sequences of choreography as well as tunable variations on input choreographic sequences. ...
Sorting Methods and Adaptive Thresholding for Histogram Based Reversible Data Hiding (Jul 11 2019). This paper presents a histogram based reversible data hiding (RDH) scheme, which divides image pixels into different cell frequency bands to sort them for data embedding. Data hiding is more efficient in lower cell frequency bands because it provides ...
Reversible Data Hiding in Encrypted Images using Local Difference of Neighboring Pixels (Jul 11 2019). This paper presents a reversible data hiding in encrypted image (RDHEI), which divides image into non-overlapping blocks. In each block, central pixel of the block is considered as leader pixel and others as follower ones. The prediction errors between ...
Heard More Than Heard: An Audio Steganography Method Based on GAN (Jul 11 2019). Audio steganography is a collection of techniques for concealing the existence of information by embedding it within a non-secret audio, which is referred to as carrier. Distinct from cryptography, the steganography put emphasis on the hiding of the secret ...
LakhNES: Improving multi-instrumental music generation with cross-domain pre-training (Jul 10 2019). We are interested in the task of generating multi-instrumental music scores. The Transformer architecture has recently shown great promise for the task of piano score generation; here we adapt it to the multi-instrumental setting. Transformers are complex, ...
Video Distortion Method for VMAF Quality Values Increasing (Jul 10 2019). Video quality measurement takes an important role in many applications. Full-reference quality metrics which are usually used in video codecs comparisons are expected to reflect any changes in videos. In this article, we consider different colour corrections ...
A New Benchmark and Approach for Fine-grained Cross-media Retrieval (Jul 10 2019). Cross-media retrieval is to return the results of various media types corresponding to the query of any media type. Existing researches generally focus on coarse-grained cross-media retrieval. When users submit an image of "Slaty-backed Gull" as a query, ...
Learning from History: Recreating and Repurposing Sister Harriet Padberg's Computer Composed Canon and Free Fugue (Jul 10 2019). Harriet Padberg wrote Computer-Composed Canon and Free Fugue as part of her 1964 dissertation in Mathematics and Music at Saint Louis University. This program is one of the earliest examples of text-to-music software and algorithmic composition, which ...
BASN -- Learning Steganography with Binary Attention Mechanism (Jul 09 2019). Secret information sharing through image carrier has aroused much research attention in recent years with images' growing domination on the Internet and mobile applications. However, with the booming trend of convolutional neural networks, image steganography ...
On the Security and Applicability of Fragile Camera Fingerprints (Jul 09 2019). Camera sensor noise is one of the most reliable device characteristics in digital image forensics, enabling the unique linkage of images to digital cameras. This so-called camera fingerprint gives rise to different applications, such as image forensics ...
Barriers towards no-reference metrics application to compressed video quality analysis: on the example of no-reference metric NIQE (Jul 08 2019). This paper analyses the application of no-reference metric NIQE to the task of video-codec comparison. A number of issues in the metric behaviour on videos was detected and described. The metric has outlying scores on black and solid-coloured frames. ...
TrackNet: A Deep Learning Network for Tracking High-speed and Tiny Objects in Sports Applications (Jul 08 2019). Ball trajectory data are one of the most fundamental and useful information in the evaluation of players' performance and analysis of game strategies. Although vision-based object tracking techniques have been developed to analyze sport competition videos, ...
An Experimental-based Review of Image Enhancement and Image Restoration Methods for Underwater Imaging (Jul 07 2019). Underwater images play a key role in ocean exploration, but often suffer from severe quality degradation due to light absorption and scattering in water medium. Although major breakthroughs have been made recently in the general area of image enhancement ...
Informative Visual Storytelling with Cross-modal Rules (Jul 07 2019). Existing methods in the Visual Storytelling field often suffer from the problem of generating general descriptions, while the image contains a lot of meaningful contents remaining unnoticed. The failure of informative story generation can be concluded ...
Synchronizing Audio-Visual Film Stimuli in Unity (version 5.5.1f1): Game Engines as a Tool for Research (Jul 05 2019). Unity is a software specifically designed for the development of video games. However, due to its programming possibilities and the polyvalence of its architecture, it can prove to be a versatile tool for stimuli presentation in research experiments. ...
Extraction and Analysis of Fictional Character Networks: A Survey (Jul 05 2019). A character network is a graph extracted from a narrative, in which vertices represent characters and edges correspond to interactions between them. A number of narrative-related problems can be addressed automatically through the analysis of character ...
Zero-shot Learning for Audio-based Music Classification and Tagging (Jul 05 2019). Audio-based music classification and tagging is typically based on categorical supervised learning with a fixed set of labels. This intrinsically cannot handle unseen labels such as newly added music genres or semantic words that users arbitrarily choose ...
Blind Image Quality Assessment Using A Deep Bilinear Convolutional Neural Network (Jul 05 2019). We propose a deep bilinear model for blind image quality assessment (BIQA) that handles both synthetic and authentic distortions. Our model consists of two convolutional neural networks (CNN), each of which specializes in one distortion scenario. For ...
The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion (Jul 05 2019). This paper describes our DKU replay detection system for the ASVspoof 2019 challenge. The goal is to develop spoofing countermeasure for automatic speaker recognition in physical access scenario. We leverage the countermeasure system pipeline from four ...
Intrinsic Image Popularity Assessment (Jul 03 2019). The goal of research in image popularity assessment (IPA) is to develop computational models that can automatically predict the potential of a social image being popular over the Internet. Here, we aim to single out the contribution of visual content ...
Intrinsic Image Popularity Assessment (Jul 03 2019, Jul 04 2019). The goal of research in automatic image popularity assessment (IPA) is to develop computational models that can accurately predict the potential of a social image to go viral on the Internet. Here, we aim to single out the contribution of visual content ...
MIDI-Sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN networks for Symbolic Single-track Music Generation (Jul 02 2019, Jul 04 2019). Most existing neural network models for music generation explore how to generate music bars, then directly splice the music bars into a song. However, these methods do not explore the relationship between the bars, and the connected song as a whole has ...
Adaptive Music Composition for Games (Jul 02 2019). The generation of music that adapts dynamically to content and actions has an important role in building more immersive, memorable and emotive game experiences. To date, the development of adaptive music systems for video games is limited by both the ...
Universal audio synthesizer control with normalizing flows (Jul 01 2019). The ubiquity of sound synthesizers has reshaped music production and even entirely defined new music genres. However, the increasing complexity and number of parameters in modern synthesizers make them harder to master. Hence, the development of methods ...
Effects of Foraging in Personalized Content-based Image Recommendation (Jun 30 2019). A major challenge of recommender systems is to help users locating interesting items. Personalized recommender systems have become very popular as they attempt to predetermine the needs of users and provide them with recommendations to personalize their ...
Frame attention networks for facial expression recognition in videos (Jun 29 2019). The video-based facial expression recognition aims to classify a given video into several basic emotions. How to integrate facial features of individual frames is crucial for this task. In this paper, we propose the Frame Attention Networks (FAN), to ...
Music Performance Analysis: A Survey (Jun 29 2019). Music Information Retrieval (MIR) tends to focus on the analysis of audio signals. Often, a single music recording is used as representative of a "song" even though different performances of the same song may reveal different properties. A performance ...
Rhythm Dungeon: A Blockchain-based Music Roguelike Game (Jun 28 2019). Rhythm Dungeon is a rhythm game which leverages the blockchain as a shared open database. During the gaming session, the player explores a roguelike dungeon by inputting specific sequences in time to music rhythm. By integrating smart contract to the ...
Non-user Inclusive Design for Maintaining Harmony of Real-Virtual Human Interaction in Augmented Reality (Jun 28 2019). Augmented reality enables the illusion of contents such as objects and humans in the virtual world co-existing with users in the real world. However, non-users who are not aware of the presence of the virtual world and dynamically move nearby might either ...
PRNU Based Source Camera Attribution for Image Sets Anonymized with Patch-Match Algorithm (Jun 27 2019). Patch-Match is an efficient algorithm used for structural image editing and available as a tool on popular commercial photo-editing software. The tool allows users to insert or remove objects from photos using information from similar scene content. Recently, ...
Representation Learning of Music Using Artist, Album, and Track Information (Jun 27 2019). Supervised music representation learning has been performed mainly using semantic labels such as music genres. However, annotating music with semantic labels requires time and cost. In this work, we investigate the use of factual metadata such as artist, ...
Pooled Steganalysis in JPEG: how to deal with the spreading strategy? (Jun 27 2019). In image pooled steganalysis, a steganalyst, Eve, aims to detect if a set of images sent by a steganographer, Alice, to a receiver, Bob, contains a hidden message. We can reasonably assess that the steganalyst does not know the strategy used to spread ...
A novel music-based game with motion capture to support cognitive and motor function in the elderly (Jun 25 2019). This paper presents a novel game prototype that uses music and motion detection as preventive medicine for the elderly. Given the aging populations around the globe, and the limited resources and staff able to care for these populations, eHealth solutions ...
Cross-Channel Correlation Preserved Three-Stream Lightweight CNNs for Demosaicking (Jun 24 2019). Demosaicking is a procedure to reconstruct full RGB images from Color Filter Array (CFA) samples, none of which has all color components available. Recent deep Convolutional Neural Networks (CNN) based models have obtained state of the art accuracy on ...
Had You Looked Where I'm Looking: Cross-user Similarities in Viewing Behavior for 360° Video and Caching Implications (Jun 24 2019). The demand and usage of 360° video services are expected to increase. However, despite these services being highly bandwidth intensive, not much is known about the potential value that basic bandwidth saving techniques such as server or edge-network ...
Cross-Platform Modeling of Users' Behavior on Social Media (Jun 23 2019). With the booming development and popularity of mobile applications, different verticals accumulate abundant data of user information and social behavior, which are spontaneous, genuine and diversified. However, each platform describes user's portraits ...
The Shape of RemiXXXes to Come: Audio Texture Synthesis with Time-frequency Scattering (Jun 21 2019). This article explains how to apply time-frequency scattering, a convolutional operator extracting modulations in the time-frequency domain at different rates and scales, to the re-synthesis and manipulation of audio textures. After implementing phase ...
Zero-shot Learning and Knowledge Transfer in Music Classification and Tagging (Jun 20 2019). Music classification and tagging is conducted through categorical supervised learning with a fixed set of labels. In principle, this cannot make predictions on unseen labels. Zero-shot learning is an approach to solve the problem by using side information ...
Understanding, Categorizing and Predicting Semantic Image-Text Relations (Jun 20 2019). Two modalities are often used to convey information in a complementary and beneficial manner, e.g., in online news, videos, educational resources, or scientific publications. The automatic understanding of semantic correlations between text and associated ...
Probabilistic Tile Visibility-Based Server-Side Rate Adaptation for Adaptive 360-Degree Video Streaming (Jun 20 2019). In this paper, we study the server-side rate adaptation problem for streaming tile-based adaptive 360-degree videos to multiple users who are competing for transmission resources at the network bottleneck. Specifically, we develop a convolutional neural ...
QoE-Aware Resource Allocation for Crowdsourced Live Streaming: A Machine Learning Approach (Jun 20 2019). Driven by the tremendous technological advancement of personal devices and the prevalence of wireless mobile network accesses, the world has witnessed an explosion in crowdsourced live streaming. Ensuring a better viewers quality of experience (QoE) is ...
A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting (Jun 20 2019). Robustness against noise is critical for keyword spotting (KWS) in real-world environments. To improve the robustness, a speech enhancement front-end is involved. Instead of treating the speech enhancement as a separated preprocessing before the KWS system, ...
Enhancement of Underwater Images with Statistical Model of Background Light and Optimization of Transmission Map (Jun 19 2019). Underwater images often have severe quality degradation and distortion due to light absorption and scattering in the water medium. A hazed image formation model is widely used to restore the image quality. It depends on two optical parameters: the background ...
Multimodal Abstractive Summarization for How2 Videos (Jun 19 2019). In this paper, we study abstractive summarization for open-domain videos. Unlike the traditional text news summarization, the goal is less to "compress" text information but rather to provide a fluent textual summary of information that has been collected ...
Recent Advances of Image Steganography with Generative Adversarial Networks (Jun 18 2019). In the past few years, the Generative Adversarial Network (GAN) which proposed in 2014 has achieved great success. GAN has achieved many research results in the field of computer vision and natural language processing. Image steganography is dedicated ...
ParNet: Position-aware Aggregated Relation Network for Image-Text matching (Jun 17 2019). Exploring fine-grained relationship between entities (e.g. objects in image or words in sentence) has great contribution to understand multimedia content precisely. Previous attention mechanism employed in image-text matching either takes multiple self ...
Relevance Feedback with Latent Variables in Riemann spaces (Jun 15 2019). In this paper we develop and evaluate two methods for relevance feedback based on endowing a suitable "semantic query space" with a Riemann metric derived from the probability distribution of the positive samples of the feedback. The first method uses ...
Audio-Based Music Classification with DenseNet And Data Augmentation (Jun 15 2019). In recent years, deep learning technique has received intense attention owing to its great success in image recognition. A tendency of adaption of deep learning in various information processing fields has formed, including music information retrieval ...
A Holistic Survey of Wireless Multipath Video Streaming (Jun 14 2019). Most of today's mobile devices are equipped with multiple network interfaces and one of the main bandwidth-hungry applications that would benefit from multipath communications is wireless video streaming. However, most of current transport protocols do ...
Deep Learning Development Environment in Virtual Reality (Jun 13 2019). Virtual reality (VR) offers immersive visualization and intuitive interaction. We leverage VR to enable any biomedical professional to deploy a deep learning (DL) model for image classification. While DL models can be powerful tools for data analysis, ...
Blockchain Games: A Survey (Jun 13 2019). With the support of the blockchain systems, the cryptocurrency has changed the world of virtual assets. Digital games, especially those with massive multi-player scenarios, will be significantly impacted by this novel technology. However, there are insufficient ...
A Security Case Study for Blockchain Games (Jun 13 2019). Blockchain gaming is an emerging entertainment paradigm. However, blockchain games are still suffering from security issues, due to the immature blockchain technologies and its unsophisticated developers. In this work, we analyzed the blockchain game ...
Grounding Object Detections With Transcriptions (Jun 13 2019). A vast amount of audio-visual data is available on the Internet thanks to video streaming services, to which users upload their content. However, there are difficulties in exploiting available data for supervised statistical models due to the lack of ...
Eye Contact Correction using Deep Neural Networks (Jun 12 2019). In a typical video conferencing setup, it is hard to maintain eye contact during a call since it requires looking into the camera rather than the display. We propose an eye contact correction model that restores the eye contact regardless of the relative ...
Differential Imaging Forensics (Jun 12 2019). We introduce some new forensics based on differential imaging, where a novel category of visual evidence created via subtle interactions of light with a scene, such as dim reflections, can be computationally extracted and amplified from an image of interest ...
Stereoscopic Omnidirectional Image Quality Assessment Based on Predictive Coding Theory (Jun 12 2019). Objective quality assessment of stereoscopic omnidirectional images is a challenging problem since it is influenced by multiple aspects such as projection deformation, field of view (FoV) range, binocular vision, visual comfort, etc. Existing studies ...
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos (Jun 06 2019). Query-based moment retrieval aims to localize the most relevant moment in an untrimmed video according to the given natural language query. Existing works often only focus on one aspect of this emerging task, such as the query representation learning, ...
CNN-based Steganalysis and Parametric Adversarial Embedding: a Game-Theoretic Framework (Jun 03 2019). CNN-based steganalysis has recently achieved very good performance in detecting content-adaptive steganography. At the same time, recent works have shown that, by adopting an approach similar to that used to build adversarial examples, a steganographer ...
Supervised Online Hashing via Similarity Distribution Learning (May 31 2019). Online hashing has attracted extensive research attention when facing streaming data. Most online hashing methods, learning binary codes based on pairwise similarities of training instances, fail to capture the semantic relationship, and suffer from a ...
TS-RNN: Text Steganalysis Based on Recurrent Neural Networks (May 30 2019). With the rapid development of natural language processing technologies, more and more text steganographic methods based on automatic text generation technology have appeared in recent years. These models use the powerful self-learning and feature extraction ...
Automatic Realistic Music Video Generation from Segments of Youtube Videos (May 29 2019). A Music Video (MV) is a video aiming at visually illustrating or extending the meaning of its background music. This paper proposes a novel method to automatically generate, from an input music track, a music video made of segments of Youtube music videos ...
Parametric context adaptive Laplace distribution for multimedia compression (May 28 2019, Jul 01 2019). Data compression often subtracts predictor and encodes the difference (residue) assuming Laplace distribution, for example for images, videos, audio, or numerical data. Its performance is strongly dependent on proper choice of width (scale parameter) ...
Online Learning for Robust Adaptive Video Streaming in Mobile Networks (May 28 2019). In this paper, we propose a novel algorithm for video quality adaptation in HTTP Adaptive Streaming (HAS), based on Online Convex Optimization (OCO). The proposed algorithm, named Learn2Adapt (L2A), is shown to provide a robust adaptation strategy which, ...
Towards robust audio spoofing detection: a detailed comparison of traditional and learned features (May 28 2019, Jun 19 2019). Automatic speaker verification, like every other biometric system, is vulnerable to spoofing attacks. Using only a few minutes of recorded voice of a genuine client of a speaker verification system, attackers can develop a variety of spoofing attacks ...
EncryptGAN: Image Steganography with Domain Transform (May 28 2019, May 29 2019). We propose an image steganographic algorithm called EncryptGAN, which disguises private image communication in an open communication channel. The insight is that content transform between two very different domains (e.g., face to flower) allows one to ...
Temporal Attentive Alignment for Video Domain Adaptation (May 26 2019, Jun 07 2019). Although various image-based domain adaptation (DA) techniques have been proposed in recent years, domain shift in videos is still not well-explored. Most previous works only evaluate performance on small-scale datasets which are saturated. Therefore, ...
Technical Report of the DAISY System -- Shooter Localization, Models, Interface, and Beyond (May 26 2019). Nowadays a huge number of user-generated videos are uploaded to social media every second, capturing glimpses of events all over the world. These videos provide important and useful information for reconstructing the events. In this paper, we describe ...
Technical Report of the Video Event Reconstruction and Analysis (VERA) System -- Shooter Localization, Models, Interface, and Beyond (May 26 2019, Jul 05 2019). Every minute, hundreds of hours of video are uploaded to social media sites and the Internet from around the world. This material creates a visual record of the experiences of a significant percentage of humanity and can help illuminate how we live in ...
Saliency detection based on structural dissimilarity induced by image quality assessment model (May 24 2019). The distinctiveness of image regions is widely used as the cue of saliency. Generally, the distinctiveness is computed according to the absolute difference of features. However, according to the image quality assessment (IQA) studies, the human visual ...
Bridging Dialogue Generation and Facial Expression Synthesis (May 24 2019, May 28 2019). Spoken dialogue systems that assist users to solve complex tasks such as movie ticket booking have become an emerging research topic in artificial intelligence and natural language processing areas. With a well-designed dialogue system as an intelligent ...
Speech2Face: Learning the Face Behind a Voice (May 23 2019). How much can we infer about a person's looks from the way they speak? In this paper, we study the task of reconstructing a facial image of a person from a short audio recording of that person speaking. We design and train a deep neural network to perform ...
An Improved Reversible Data Hiding in Encrypted Images using Parametric Binary Tree Labeling (May 23 2019). This work proposes an improved reversible data hiding scheme in encrypted images using parametric binary tree labeling (IPBTL-RDHEI), which takes advantage of the spatial correlation in the entire original image not in small image blocks to reserve room ...
Effects of Packet Loss and Jitter on VoLTE Call Quality (May 22 2019). This work performs a preliminary, comparative analysis of the end-to-end quality guaranteed by Voice over LTE (VoLTE), examining several millions of VoLTE calls that employ two popular speech audio codecs, namely, Adaptive Multi-Rate (AMR) and Adaptive ...
Multiple reconstruction compression framework based on PNG image (May 22 2019). It is shown that neural networks (NNs) achieve excellent performances in image compression and reconstruction. However, there are still many shortcomings in the practical application, which eventually lead to the loss of neural network image processing ...
Image Encryption Algorithm Based on Facebook Social Network (May 21 2019). Facebook is the online social networks (OSNs) platform with the largest number of users in the world today, information protection based on Facebook social network platform have important practical significance. Since the information users share on social ...
Predicting TED Talk Ratings from Language and Prosody (May 21 2019). We use the largest open repository of public speaking---TED Talks---to predict the ratings of the online viewers. Our dataset contains over 2200 TED Talk transcripts (includes over 200 thousand sentences), audio features and the associated meta information ...
Evaluation of 4D Light Field Compression Methods (May 17 2019). Light field data records the amount of light at multiple points in space, captured e.g. by an array of cameras or by a light-field camera that uses microlenses. Since the storage and transmission requirements for such data are tremendous, compression ...
Reactive Video Caching via long-short-term fusion approach (May 16 2019). Video caching has been a basic network functionality in today's network architectures. Although the abundance of caching replacement algorithms has been proposed recently, these methods all suffer from a key limitation: due to their immature rules, inaccurate ...
EVSO: Environment-aware Video Streaming Optimization of Power Consumption (May 16 2019). Streaming services gradually support high-quality videos for better user experience. However, streaming high-quality video on mobile devices consumes a considerable amount of energy. This paper presents the design and prototype of EVSO, which achieves ...
Food Recommendation: Framework, Existing Solutions and Challenges (May 15 2019). A growing proportion of the global population is becoming overweight or obese, leading to various diseases (e.g., diabetes, ischemic heart disease and even cancer) due to unhealthy eating patterns, such as increased intake of food with high energy and ...
Statistical Learning Based Congestion Control for Real-time Video Communication (May 15 2019, May 16 2019). With the increasing demands on interactive video applications, how to adapt video bit rate to avoid network congestion has become critical, since congestion results in self-inflicted delay and packet loss which deteriorate the quality of real-time video ...
SmartBullets: A Cloud-Assisted Bullet Screen Filter based on Deep Learning (May 15 2019). Bullet-screen is a technique that enables the website users to send real-time comment 'bullet' cross the screen. Compared with the traditional review of a video, bullet-screen provides new features of feeling expression to video watching and more iterations ...
Learning to Groove with Inverse Sequence Transformations (May 14 2019). We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using Seq2Seq and recurrent Variational Information Bottleneck (VIB) models. Though Seq2Seq models usually require painstakingly aligned corpora, we ...
High Capacity Lossless Data Hiding for JPEG images by NLCM Relationship Construction (May 14 2019). In this paper, we propose a high capacity lossless data hiding (LDH) scheme that achieves high embedding capacity and keeps the image quality unchanged. In JPEG bitstream, Huffman coding is adopted to encode image data. In fact, some Huffman codes are ...
Expression Conditional GAN for Facial Expression-to-Expression Translation (May 14 2019). In this paper, we focus on the facial expression translation task and propose a novel Expression Conditional GAN (ECGAN) which can learn the mapping from one image domain to another one based on an additional expression attribute. The proposed ECGAN is ...
Reversible data hiding based on reducing invalid shifting of pixels in histogram shifting (May 14 2019). In recent years, reversible data hiding (RDH), a new research hotspot in the field of information security, has been paid more and more attention by researchers. Most of the existing RDH schemes do not fully take it into account that natural image's texture ...
FPGA-based Binocular Image Feature Extraction and Matching System (May 13 2019, May 14 2019). Image feature extraction and matching is a fundamental but computation intensive task in machine vision. This paper proposes a novel FPGA-based embedded system to accelerate feature extraction and matching. It implements SURF feature point detection and ...
Group Re-identification via Transferred Single and Couple Representation Learning (May 13 2019). Group re-identification (G-ReID) is an important yet less-studied task. Its challenges not only lie in appearance changes of individuals which have been well-investigated in general person re-identification (ReID), but also derive from group layout and ...