Latest in cs.mm

Similarity measures for vocal-based drum sample retrieval using deep convolutional auto-encodersFeb 14 2018The expressive nature of the voice provides a powerful medium for communicating sonic ideas, motivating recent research on methods for query by vocalisation. Meanwhile, deep learning methods have demonstrated state-of-the-art results for matching vocal ... More
Learning to score and summarize figure skating sport videosFeb 08 2018This paper focuses on fully understanding figure skating sport videos. In particular, we present a large-scale figure skating sport video dataset, which includes 500 figure skating videos. On average, the length of each video is 2 minutes and 50 seconds. ... More
Fine-Grained Land Use Classification at the City Scale Using Ground-Level ImagesFeb 07 2018We perform fine-grained land use mapping at the city scale using ground-level images. Mapping land use is considerably more difficult than mapping land cover and is generally not possible using overhead imagery as it requires close-up views and seeing ... More
Computer-Aided Annotation for Video Tampering Dataset of Forensic ResearchFeb 07 2018The annotation of a video tampering dataset is a boring task that takes a lot of manpower and financial resources. At present, there is no published literature capable of improving the annotation efficiency of forged videos. We presented a computer-aided ... More
The New Modality: Emoji Challenges in Prediction, Anticipation, and RetrievalJan 30 2018Feb 02 2018Over the past decade, emoji have emerged as a new and widespread form of digital communication, spanning diverse social networks and spoken languages. We propose to treat these ideograms as a new modality in their own right, distinct in their semantic ... More
Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in MediaJan 04 2018Daily engagement in life experiences is increasingly interwoven with mobile device use. Screen capture at the scale of seconds is being used in behavioral studies and to implement "just-in-time" health interventions. The increasing psychological breadth ... More
Field Studies with Multimedia Big Data: Opportunities and Challenges (Extended Version)Dec 28 2017Social multimedia users are increasingly sharing all kinds of data about the world. They do this for their own reasons, not to provide data for field studies, but the trend presents a great opportunity for scientists. The Yahoo Flickr Creative Commons ... More
Towards Structured Analysis of Broadcast Badminton VideosDec 23 2017Sports video data is recorded for nearly every major tournament but remains archived and inaccessible to large scale data mining and analytics. It can only be viewed sequentially or manually tagged with higher-level labels which is time consuming and ... More
Beautiful and damned. Combined effect of content quality and social ties on user engagementNov 01 2017User participation in online communities is driven by the intertwinement of the social network structure with the crowd-generated content that flows along its links. These aspects are rarely explored jointly and at scale. By looking at how users generate ... More
Fast MPEG-CDVS Encoder with GPU-CPU Hybrid ComputingMay 27 2017Jun 09 2017The compact descriptors for visual search (CDVS) standard from ISO/IEC moving pictures experts group (MPEG) has succeeded in enabling the interoperability for efficient and effective image retrieval by standardizing the bitstream syntax of compact feature ... More
Query-adaptive Video Summarization via Quality-aware Relevance EstimationMay 01 2017Sep 28 2017Although the problem of automatic video summarization has recently received a lot of attention, the problem of creating a video summary that also highlights elements relevant to a search query has been less studied. We address this problem by posing query-relevant ... More
Content-Based Video Retrieval in Historical Collections of the German Broadcasting ArchiveFeb 13 2017The German Broadcasting Archive (DRA) maintains the cultural heritage of radio and television broadcasts of the former German Democratic Republic (GDR). The uniqueness and importance of the video material stimulates a large scientific interest in the ... More
Towards computer-assisted understanding of dynamics in symphonic musicDec 07 2016Many people enjoy classical symphonic music. Its diverse instrumentation makes for a rich listening experience. This diversity adds to the conductor's expressive freedom to shape the sound according to their imagination. As a result, the same piece may ... More
Binary Subspace Coding for Query-by-Image Video RetrievalDec 06 2016The query-by-image video retrieval (QBIVR) task has been attracting considerable research attention recently. However, most existing methods represent a video by either aggregating or projecting all its frames into a single datum point, which may easily ... More
A novel Adaptive weighted Kronecker Compressive SensingDec 04 2016Recently, multidimensional signal reconstruction from a small number of measurements has attracted great interest. Therefore, an effective sampling scheme that acquires the most signal information from a small number of measurements is required. In this ... More
Algorithmic Songwriting with ALYSIADec 04 2016This paper introduces ALYSIA: Automated LYrical SongwrIting Application. ALYSIA is based on a machine learning model using Random Forests, and we discuss its success at pitch and rhythm prediction. Next, we show how ALYSIA was used to create original ... More
Energy-efficient 8-point DCT Approximations: Theory and Hardware ArchitecturesDec 02 2016Due to its remarkable energy compaction properties, the discrete cosine transform (DCT) is employed in a multitude of compression standards, such as JPEG and H.265/HEVC. Several low-complexity integer approximations for the DCT have been proposed for ... More
Fast Supervised Discrete Hashing and its AnalysisNov 30 2016In this paper, we propose a learning-based supervised discrete hashing method. Binary hashing is widely used for large-scale image retrieval as well as video and document searches because the compact representation of binary code is essential for data ... More
Prediction of Video Popularity in the Absence of Reliable Data from Video Hosting Services: Utility of Traces Left by Users on the WebNov 28 2016With the growth of user-generated content, we observe a constant rise in the number of companies, such as search engines, content aggregators, etc., that operate with tremendous amounts of web content without being the services that host it. Thus, aiming ... More
A Second Order Derivatives based Approach for SteganographyNov 25 2016Steganography schemes are designed with the objective of minimizing a defined distortion function. In most existing state of the art approaches, this distortion function is based on image feature preservation. Since smooth regions or clean edges define ... More
MOMOS-MT: Mobile Monophonic System for Music TranscriptionNov 22 2016Music holds a significant cultural role in social identity and in the encouragement of socialization. Technology, by the destruction of physical and cultural distance, has led to many changes in musical themes and the complete loss of forms. Yet, it ... More
CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact SuppressionNov 22 2016Lossy image compression algorithms are pervasively used to reduce the size of images transmitted over the web and recorded on data storage media. However, we pay for their high compression rate with visual artifacts degrading the user experience. Deep ... More
Exploiting Web Images for Dataset Construction: A Domain Robust ApproachNov 22 2016Labelled image datasets have played a critical role in high-level image understanding; however the process of manual labelling is both time-consuming and labor intensive. To reduce the cost of manual labelling, there has been increased research interest ... More
Image Credibility Analysis with Effective Domain Transferred Deep NetworksNov 16 2016Numerous fake images spread on social media today and can severely jeopardize the credibility of online content to the public. In this paper, we employ deep networks to learn distinct fake image related features. In contrast to authentic images, fake images ... More
Zero-resource Machine Translation by Multimodal Encoder-decoder Network with Multimedia PivotNov 14 2016We propose an approach to build a neural machine translation system with no supervised resources (i.e., no parallel corpora) using multimodal embedded representation over texts and images. Based on the assumption that text documents are often likely to ... More
Columbia MVSO Image Sentiment DatasetNov 14 2016The Multilingual Visual Sentiment Ontology (MVSO) consists of 15,600 concepts in 12 different languages that are strongly related to emotions and sentiments expressed in images. These concepts are defined in the form of Adjective-Noun Pair (ANP), which ... More
Leveraging Video Descriptions to Learn Video Question AnsweringNov 12 2016We propose a scalable approach to learn video-based question answering (QA): answer a "free-form natural language question" about a video content. Our approach automatically harvests a large number of videos and descriptions freely available online. Then, ... More
Show me the material evidence: Initial experiments on evaluating hypotheses from user-generated multimedia dataNov 11 2016Subjective questions such as `does neymar dive', or `is clinton lying', or `is trump a fascist', are popular queries to web search engines, as can be seen by autocompletion suggestions on Google, Yahoo and Bing. In the era of cognitive computing, beyond ... More
Large-scale JPEG steganalysis using hybrid deep-learning frameworkNov 10 2016Deep learning frameworks have achieved overwhelming superiority in many fields of pattern recognition in recent years. However, the application of deep learning frameworks in image steganalysis is still in its initial stage. In this paper we firstly proved ... More
Recover Subjective Quality Scores from Noisy MeasurementsNov 06 2016Simple quality metrics such as PSNR are known to not correlate well with subjective quality when tested across a wide spectrum of video content or quality regime. Recently, efforts have been made in designing objective quality metrics trained on subjective ... More
Phase Shift Keying on the Hypersphere: Power-Efficient MIMO CommunicationsNov 03 2016Nov 22 2016We present Phase Shift Keying on the Hypersphere (PSKH), a generalization of conventional Phase Shift Keying (PSK) for Multiple-Input Multiple-Output (MIMO) systems, where data constellations are distributed over a multidimensional hypersphere. The use ... More
QoE-based MAC Layer Optimization for Video Teleconferencing over WiFiNov 03 2016In IEEE 802.11, the retry limit is set to the same value for all packets. In this paper, we dynamically classify video teleconferencing packets based on the type of the video frame that a packet carries and the packet loss events that have happened in the ... More
Steerable Discrete Cosine TransformOct 28 2016In image compression, classical block-based separable transforms tend to be inefficient when image blocks contain arbitrarily shaped discontinuities. For this reason, transforms incorporating directional information are an appealing alternative. In this ... More
Cross-Modal Scene NetworksOct 27 2016People can recognize scenes across many different modalities beyond natural images. In this paper, we investigate how to learn cross-modal scene representations that transfer across modalities. To study this problem, we introduce a new cross-modal scene ... More
A Novel Boundary Matching Algorithm for Video Temporal Error ConcealmentOct 25 2016With the fast growth of communication networks, the video data transmission from these networks is extremely vulnerable. Error concealment is a technique to estimate the damaged data by employing the correctly received data at the decoder. In this paper, ... More
QoE-aware Scalable Video Transmission in MIMO SystemsOct 24 2016An important concept in wireless systems has been quality of experience (QoE)-aware video transmission. Such communications are considered not only connection-based communications but also content-aware communications, since the video quality is closely ... More
An Efficient Adaptive Boundary Matching Algorithm for Video Error ConcealmentOct 24 2016Sending compressed video data in error-prone environments (like the Internet and wireless networks) might cause data degradation. Error concealment techniques try to conceal the received data in the decoder side. In this paper, an adaptive boundary matching ... More
A Classification Engine for Image Ballistics of Social DataOct 20 2016Image Forensics has already achieved great results for the source camera identification task on images. Standard approaches for data coming from Social Network Platforms cannot be applied due to different processes involved (e.g., scaling, compression, ... More
Steganography between Silence Intervals of Audio in Video Content Using Chaotic MapsOct 14 2016Steganography is the art of hiding data in such a way that it is undetectable under traffic-pattern analysis and the hidden data is known only to the receiver and the sender. In this paper, a new method of text steganography over the silence interval of ... More
Open-Ended Visual Question-AnsweringOct 09 2016This thesis report studies methods to solve Visual Question-Answering (VQA) tasks with a Deep Learning framework. As a preliminary step, we explore Long Short-Term Memory (LSTM) networks used in Natural Language Processing (NLP) to tackle Question-Answering ... More
Saliency-Guided Complexity Control for HEVC DecodingOct 08 2016The latest High Efficiency Video Coding (HEVC) standard significantly improves coding efficiency over its previous video coding standards. The expense of such improvement is enormous computational complexity, from both encoding and decoding perspectives. ... More
Perceptually-Driven Video Coding with the Daala Video CodecOct 08 2016The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual ... More
mPDF: Framework for Watermarking PDF Files using Image Watermarking AlgorithmsOct 07 2016Advances in digital technologies have made it possible to produce perfect copies of digital content. In this environment, malicious users reproduce the digital content and share it without compensation to the content owner. Content owners are concerned ... More
Backward-Shifted Coding (BSC) based on Scalable Video Coding for HASOct 07 2016The main task of HTTP Adaptive Streaming is to adapt video quality dynamically under variable network conditions. This is a key feature for multimedia delivery especially when quality of service cannot be granted network-wide and, e.g., throughput may ... More
Deep Image Aesthetics Classification using Inception Modules and Fine-tuning Connected LayerOct 07 2016In this paper we investigate the image aesthetics classification problem, aka, automatically classifying an image into low or high aesthetic quality, which is quite a challenging problem beyond image recognition. Deep convolutional neural network (DCNN) ... More
MoveSteg: A Method of Network Steganography DetectionOct 06 2016This article presents a new method for detecting a source point of time based network steganography - MoveSteg. A steganography carrier could be an example of multimedia stream made with packets. These packets are then delayed intentionally to send hidden ... More
Application Layer Congestion Control for Network-Aware Telehaptic CommunicationOct 03 2016In a real-world network, shared by several users, telehaptic applications involving delay-sensitive multimedia communication between remote locations demand distinct Quality of Service (QoS) guarantees for different media components. These QoS constraints ... More
Adaptive 360 VR Video Streaming: Divide and Conquer!Sep 28 2016Oct 18 2016While traditional multimedia applications such as games and videos are still popular, there has been a significant interest in the recent years towards new 3D media such as 3D immersion and Virtual Reality (VR) applications, especially 360 VR videos. ... More
Adaptive 360 VR Video Streaming: Divide and Conquer!Sep 28 2016Oct 17 2016While traditional multimedia applications such as games and videos are still popular, there has been a significant interest in the recent years towards new 3D media such as 3D immersion and Virtual Reality (VR) applications, especially 360 VR videos. ... More
Viewport-Adaptive Navigable 360-Degree Video DeliverySep 26 2016The delivery and display of ultra high resolution 360-degree videos on Head-Mounted Displays (HMDs) presents a number of technical challenges. 360-degree videos are high resolution spherical videos that contain an omnidirectional view of the scene, however ... More
Location-Based and Audience-Aware StorytellingSep 26 2016While the daily user of digital, Internet-enabled devices has some explicit control over what they read and see, the providers fulfilling searches, offering options, and presenting material are using increasingly sophisticated real-time algorithms that ... More
Low-complexity Image and Video Coding Based on an Approximate Discrete Tchebichef TransformSep 24 2016The usage of linear transformations has great relevance for data decorrelation applications, like image and video compression. In that sense, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform ... More
Deep Quality: A Deep No-reference Quality Assessment SystemSep 22 2016Image quality assessment (IQA) continues to garner great interest in the research community, particularly given the tremendous rise in consumer video capture and streaming. Despite significant research effort in IQA in the past few decades, the area of ... More
Deep Learning for Video Classification and CaptioningSep 22 2016Accelerated by the tremendous increase in Internet bandwidth and storage space, video data has been generated, published and spread explosively, becoming an indispensable part of today's big data. In this paper, we focus on reviewing two lines of research ... More
Spatio-Temporal Sentiment Hotspot Detection Using Geotagged PhotosSep 21 2016We perform spatio-temporal analysis of public sentiment using geotagged photo collections. We develop a deep learning-based classifier that predicts the emotion conveyed by an image. This allows us to associate sentiment with place. We perform spatial ... More
Land Use Classification using Convolutional Neural Networks Applied to Ground-Level ImagesSep 21 2016Land use mapping is a fundamental yet challenging task in geographic science. In contrast to land cover mapping, it is generally not possible using overhead imagery. The recent, explosive growth of online geo-referenced photo collections suggests an alternate ... More
Multimedia Communication Quality Assessment TestbedsSep 21 2016We make an intensive use of multimedia frameworks in our research on modeling the perceived quality estimation in streaming services and real-time communications. In our preliminary work, we have used the VLC VOD software to generate reference audiovisual ... More
Towards the bio-personalization of music recommendation systems: A single-sensor EEG biomarker of subjective music preferenceSep 21 2016Recent advances in biosensors technology and mobile electroencephalographic (EEG) interfaces have opened new application fields for cognitive monitoring. A computable biomarker for the assessment of spontaneous aesthetic brain responses during music listening ... More
Minimizing Compression Artifacts for High Resolutions with Adaptive Quantization Matrices for HEVCSep 21 2016Visual Display Units (VDUs), capable of displaying video data at High Definition (HD) and Ultra HD (UHD) resolutions, are frequently employed in a variety of technological domains. Quantization-induced video compression artifacts, which are usually unnoticeable ... More
A Consumer BCI for Automated Music Evaluation Within a Popular On-Demand Music Streaming Service - Taking Listener's Brainwaves to ExtremesSep 20 2016Sep 30 2016We investigated the possibility of using a machine-learning scheme in conjunction with commercial wearable EEG-devices for translating listener's subjective experience of music into scores that can be used for the automated annotation of music in popular ... More
FPGA implementation of the procedures for video quality assessmentSep 20 2016Video resolutions used in a variety of media are constantly rising. While manufacturers struggle to perfect their screens, it is also important to ensure high quality of the displayed image. Overall quality can be measured using Mean Opinion Score (MOS). Video ... More
An Approach for Self-Training Audio Event Detectors Using Web DataSep 20 2016Sep 23 2016Audio event detection in the era of Big Data has the constraint of lacking annotations to train robust models that match the scale of class diversity. This is mainly due to the expensive and time-consuming process of manually annotating sound events in ... More
Deep CTR Prediction in Display AdvertisingSep 20 2016Click through rate (CTR) prediction of image ads is the core task of online display advertising systems, and logistic regression (LR) has been frequently applied as the prediction model. However, the LR model lacks the ability to extract complex and intrinsic ... More
Generalized residual vector quantization for large scale dataSep 17 2016Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework ... More
Color-Based Coding Unit Level Adaptive Quantization for HEVCSep 15 2016Nov 06 2016HEVC HM 16 includes a Coding Unit (CU) level perceptual quantization technique named AdaptiveQP. AdaptiveQP adjusts the Quantization Parameter (QP) at the CU level based on the spatial activity of samples in the four constituent NxN sub-blocks of the ... More
Convolutional Recurrent Neural Networks for Music ClassificationSep 14 2016Nov 03 2016We introduce a convolutional recurrent neural network (CRNN) for music tagging. CRNNs take advantage of convolutional neural networks (CNNs) for local feature extraction and recurrent neural networks for temporal summarisation of the extracted features. ... More
Optimal Representations for Adaptive Streaming in Interactive Multi-View Video SystemsSep 14 2016Interactive multi-view video streaming (IMVS) services permit to remotely immerse within a 3D scene. This is possible by transmitting a set of reference camera views (anchor views), which are used by the clients to freely navigate in the scene and possibly ... More
Active Canny: Edge Detection and Recovery with Open Active Contour ModelsSep 12 2016We introduce an edge detection and recovery framework based on open active contour models (snakelets). This is motivated by the noisy or broken edges output by standard edge detection algorithms, like Canny. The idea is to utilize the local continuity ... More
A Tube-and-Droplet-based Approach for Representing and Analyzing Motion TrajectoriesSep 10 2016Trajectory analysis is essential in many applications. In this paper, we address the problem of representing motion trajectories in a highly informative way, and consequently utilize it for analyzing trajectories. Our approach first leverages the complete ... More
To Click or Not To Click: Automatic Selection of Beautiful Thumbnails from VideosSep 06 2016Thumbnails play such an important role in online videos. As the most representative snapshot, they capture the essence of a video and provide the first impression to the viewers; ultimately, a great thumbnail makes a video more attractive to click and ... More
Towards Hybrid Cloud-assisted Crowdsourced Live Streaming: Measurement and AnalysisAug 31 2016Crowdsourced Live Streaming (CLS), most notably Twitch.tv, has seen explosive growth in its popularity in the past few years. In such systems, any user can broadcast live video content of interest to others, e.g., from a game player to many online viewers. ... More
On the Efficiency and Fairness of Multiplayer HTTP-based Adaptive Video StreamingAug 29 2016User-perceived quality-of-experience (QoE) is critical in internet video delivery systems. Extensive prior work has studied the design of client-side bitrate adaptation algorithms to maximize single-player QoE. However, multiplayer QoE fairness becomes ... More
Human Action Recognition without HumanAug 29 2016The objective of this paper is to evaluate "human action recognition without human". Motion representation is frequently discussed in human action recognition. We have examined several sophisticated options, such as dense trajectories (DT) and the two-stream ... More
Applying Topological Persistence in Convolutional Neural Network for Music Audio SignalsAug 26 2016Recent years have witnessed an increased interest in the application of persistent homology, a topological tool for data analysis, to machine learning problems. Persistent homology is known for its ability to numerically characterize the shapes of spaces ... More
YouSkyde: Information Hiding for Skype Video TrafficAug 25 2016In this paper a new information hiding method for Skype videoconference calls - YouSkyde - is introduced. A Skype traffic analysis revealed that introducing intentional losses into the Skype video traffic stream to provide the means for clandestine communication ... More
Title Generation for User Generated VideosAug 25 2016Sep 08 2016A great video title describes the most salient event compactly and captures the viewer's attention. In contrast, video captioning tends to generate sentences that describe the video as a whole. Although generating a video title automatically is a very ... More
Automatic Synchronization of Multi-User Photo GalleriesAug 24 2016In this paper we address the issue of photo galleries synchronization, where pictures related to the same event are collected by different users. Existing solutions to address the problem are usually based on unrealistic assumptions, like time consistency ... More
A Convolutional Neural Network Approach for Post-Processing in HEVC Intra CodingAug 24 2016Oct 29 2016Lossy image and video compression algorithms yield visually annoying artifacts including blocking, blurring, and ringing, especially at low bit-rates. To reduce these artifacts, post-processing techniques have been extensively studied. Recently, inspired ... More
Steganalyzer performances in operational contextsAug 20 2016Steganography and steganalysis are two important branches of the information hiding field of research. Steganography methods consist in hiding information in such a way that the secret message is undetectable for the uninitiated. Steganalysis encompasses ... More
MT3S: Mobile Turkish Scene Text-to-Speech System for the Visually ImpairedAug 17 2016Reading text is one of the essential needs of visually impaired people. We developed a mobile system that can read Turkish scene and book text, using a fast gradient-based multi-scale text detection algorithm for real-time operation and Tesseract ... More
Globally Variance-Constrained Sparse Representation for Image Set CompressionAug 17 2016Sparse representation presents an efficient approach to approximately recover a signal by the linear composition of a few bases from a learnt dictionary, based on which various successful applications have been observed. However, in the scenario of data ... More
Towards Music Captioning: Generating Music Playlist DescriptionsAug 17 2016Descriptions are often provided along with recommendations to help users' discovery. Recommending automatically generated music playlists (e.g. personalised playlists) introduces the problem of generating descriptions. In this paper, we propose a method ... More
An image compression and encryption scheme based on deep learningAug 16 2016Oct 09 2016Stacked Auto-Encoder (SAE) is a kind of deep learning algorithm for unsupervised learning, with multiple layers that project the vector representation of input data into a lower vector space. These projection vectors are dense representations of the ... More
Detecting Vanishing Points in Natural Scenes with Application in Photo Composition AnalysisAug 15 2016Linear perspective is widely used in landscape photography to create the impression of depth on a 2D photo. Automated understanding of the use of linear perspective in landscape photography has a number of real-world applications, including aesthetics ... More
Multi-View Product Image Search Using ConvNets FeaturesAug 11 2016Multi-view queries on a multi-view product image database with bag-of-visual words (BoWs) have been shown to improve the average precision significantly compared to traditional single view queries on single view databases. In this paper, we investigate ... More
Mining Fashion Outfit Composition Using An End-to-End Deep Learning Approach on Set DataAug 10 2016Fashion composition involves deep understanding of fashion standards while incorporating creativity for choosing multiple fashion items (e.g., Jewelry, Bag, Pants, Dress). In fashion websites, popular or high-quality fashion compositions are usually designed ... More
StegIbiza: New Method for Information Hiding in Club MusicAug 09 2016In this paper a new method for information hiding in club music is introduced. The method called StegIbiza is based on using the music tempo as a carrier. The tempo is modulated by hidden messages with a 3-value coding scheme, which is an adoption of ... More
Semi-Fragile Image Authentication based on CFD and 3-Bit QuantizationAug 08 2016There is a great advantage to using watermarking in the context of conventional authentication, since it does not require additional storage space for supplementary metadata. However, JPEG compression, being a conventional method to compress images, leads ... More
Detecting Sarcasm in Multimodal Social PlatformsAug 08 2016Sarcasm is a peculiar form of sentiment expression, where the surface sentiment differs from the implied sentiment. The detection of sarcasm in social media platforms has been applied in the past mainly to textual utterances where lexical indicators (such ... More
Daala: Building A Next-Generation Video Codec From Unconventional TechnologyAug 05 2016Daala is a new royalty-free video codec that attempts to compete with state-of-the-art royalty-bearing codecs. To do so, it must achieve good compression while avoiding all of their patented techniques. We use technology that is as different as possible ... More
Media Query Processing For The Internet-of-Things: Coupling Of Device Energy Consumption And Cloud Infrastructure BillingAug 02 2016Audio/visual recognition and retrieval applications have recently garnered significant attention within Internet-of-Things (IoT) oriented services, given that video cameras and audio processing chipsets are now ubiquitous even in low-end embedded systems. ... More
PicHunt: Social Media Image Retrieval for Improved Law EnforcementAug 02 2016Sep 15 2016First responders are increasingly using social media to identify and reduce crime for the well-being and safety of society. Images shared on social media that hurt the religious, political, communal and other sentiments of people often instigate violence and ... More
Skipping Selected Steps of DWT Computation in Lossless JPEG 2000 for Improved BitratesAug 01 2016In order to improve bitrates of lossless JPEG 2000, we propose to modify the discrete wavelet transform (DWT) by skipping selected steps of its computation. We employ a heuristic to construct the skipped steps DWT (SS-DWT) in an image-adaptive way and ... More