A Unified Music Recommender System Using
Users’ Listening Habits and Semantics of Tags
Hyon Hee Kim
Department of Statistics and Information Science,
Dongduk Women’s University
Outline
• Motivation & Objectives
• Overview of the System
• Generation of User Profiles
• A Unified Music Recommendation
• Performance Evaluation
• Related Work
• Conclusions and Future Work
Motivation (1/3)
• In a Social Music Site
– Music recommendation is essential.
– Music recommendation is different from other product recommendation
• Explicit information : Rating system
• Implicit information : the number of plays
• Listening habits-based User Profiling
– Cold Start Problem
• New users with little information
• New items with only a few ratings
– Data Sparsity Problem
• The available data is very small compared to the number of music items
Motivation (2/3)
[Figure: example tags: classic rock, british, pop, rock]
• Collaborative Tagging
– A tool for users to represent their preferences about web resources
– Users add freely chosen keywords to web resources
– Tag data can be used for user profiling in personalized recommender systems
• Tag-based User Profiling
– Tags are easily added, even without listening to the music
– Tags are semantically meaningful
Motivation (3/3)
• In the case of last.fm
• Factual Tags
– 85% of tags
– genre, region, instrumentation
• Emotional Tags
– 10% of tags
– opinion, sentiment, mood
• Personal Tags
– 5% of tags
– to organize, to browse, etc.
Objectives
• A Novel Approach to Music Recommendation
– Combining listening habits and semantics of tags
• Using a Tag Ontology and an Emotion Ontology
– UniTag: Resolving semantic ambiguity of tags
– UniEmotion: Assigning weighted values to the emotional tags
→ Semantically Enhanced Music Recommendation
Overview of the System
Outline
• Motivation & Objectives
• Overview of the System
• Tag-based User Profiling
– Preprocessing of tags
– Algorithms for generating user profiles
– Preliminary experimental results
• A Unified Music Recommendation
• Performance Evaluation
• Related Work
• Conclusions and Future Work
Preprocessing of Tags (1/3)
• Tags have no predefined vocabulary or term hierarchy
• Problems of tag data
– Synonymy
• Different words represent the same meaning
• E.g., hiphop, hip-hop, hip hop / R & B, Rhythm and Blues, Blues
– Polysemy
• A single word has multiple meanings
• E.g., French => French rock, French pop, French artist
– Spelling variants
• Misspellings
• Foreign-language terms
Preprocessing of Tags (2/3)
• Tag Ontology
– Tags, users, items
• UniTag Ontology
– uniTag:Users
• uniTag:userID, uniTag:hasAdded, uniTag:hasAddedTo
– uniTag:Items
• uniTag:itemID
– uniTag:Tags
• uniTag:tagID, uniTag:tagName, uniTag:RTag, uniTag:subTag,
• uniTag:Rtags {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
• uniTag:classifiedAs, uniTag:isKindOf, uniTag:istheSameAs, uniTag:tagVariation
Preprocessing of Tags (3/3)
• Rules for reasoning about prefixes
– French rock, progressive rock, post rock => rock
Tag(?t) ^ tagPrefix(?t, ?p) ^ Prefix(?p) ^ subTag(?t, ?s) ^ Rtags(?s) -> classifiedAs(?t, ?s)
• Rules for reasoning about expert knowledge
– Soul => rhythm and blues, rhythm and blues => blues, then Soul => blues
Tag(?t) ^ isKindOf(?t, ?A) ^ isKindOf(?A, ?B) -> isKindOf(?t, ?B)
• Rules for reasoning about synonyms
– Hip-hop, hiphop => hip hop
Tag(?t) ^ tagVariation(?t, ?R) ^ istheSameAs(?t, ?s) -> tagVariation(?s, ?R)
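The three rule families above can be approximated outside a reasoner. The following is a minimal Python sketch, not the UniTag implementation: the lookup tables `SAME_AS` and `IS_KIND_OF` are hypothetical stand-ins for ontology facts, and `find_rtag` is an assumed name for the mapping from a raw tag to a representative tag.

```python
import re

# Representative tags listed on the UniTag slide.
RTAGS = {"rock", "hiphop", "electronic", "metal", "jazz",
         "rap", "funk", "folk", "blues", "reggae"}

# Hypothetical facts standing in for istheSameAs / tagVariation.
SAME_AS = {"hip-hop": "hiphop", "hip hop": "hiphop",
           "r & b": "rhythm and blues"}

# Hypothetical facts standing in for isKindOf.
IS_KIND_OF = {"soul": "rhythm and blues", "rhythm and blues": "blues"}


def canonical(tag):
    """Lower-case, collapse whitespace, and apply the synonym table."""
    t = re.sub(r"\s+", " ", tag.strip().lower())
    return SAME_AS.get(t, t)


def find_rtag(tag):
    """Return the representative tag for a raw tag, or None."""
    t = canonical(tag)
    seen = set()
    # Expert-knowledge rule: follow isKindOf transitively.
    while t in IS_KIND_OF and t not in seen:
        seen.add(t)
        t = IS_KIND_OF[t]
    if t in RTAGS:
        return t
    # Prefix rule: "<prefix> <representative tag>" is classified
    # as the representative tag, e.g. "french rock" -> "rock".
    parts = t.split(" ")
    for start in range(1, len(parts)):
        suffix = canonical(" ".join(parts[start:]))
        if suffix in RTAGS:
            return suffix
    return None


for raw in ["French rock", "Hip-Hop", "Soul", "british"]:
    print(raw, "->", find_rtag(raw))
```

Tags that match no rule, such as "british" here, return None and would need additional facts in the ontology.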
Algorithm for Generating User Profiles (1/2)
Algorithm 1. Generation of a Tag-based Profile
Input: set of representative tags Tr, set of a user's tags Tu
Output: frequency of each representative tag for the user, FTr
var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
var tagFrequency[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var RTag = null
while ∃ next tag t in Tu do
  RTag = FindRTag(t)
  for each index i of RTags do
    if RTag == RTags[i] then
      tagFrequency[i] = tagFrequency[i] + 1
endwhile

       rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
user1     6       2           2      3     2    4     3     1      1       1
user2     5       0           0      0     0    0     0     0      1       0
user3     2       2           1      1     1    1     2     0      0       1
user4    10       1           0      1     2    0     2     3      3       1
user5     1       4           0      0     0    4     1     0      0       0
Table 1. An example of tag-based profiles
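The counting loop of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions: `FindRTag` is named on the slide but not defined there, so a small hypothetical dictionary `TOY_RTAG_LOOKUP` stands in for it, and `build_profile` is an assumed name. The same function also covers the track-based variant if the mapping resolves tracks to genres instead.

```python
from collections import Counter

RTAGS = ["rock", "hiphop", "electronic", "metal", "jazz",
         "rap", "funk", "folk", "blues", "reggae"]

# Hypothetical stand-in for FindRTag: raw tag -> representative tag.
TOY_RTAG_LOOKUP = {
    "rock": "rock",
    "french rock": "rock",
    "hip-hop": "hiphop",
    "soul": "blues",
}


def build_profile(raw_tags, lookup):
    """Count how often each representative tag occurs for one user."""
    counts = Counter(lookup.get(tag) for tag in raw_tags)
    # Tags with no representative tag are counted under None and ignored.
    return [counts.get(rtag, 0) for rtag in RTAGS]


profile = build_profile(
    ["rock", "french rock", "hip-hop", "soul", "seen live"],
    TOY_RTAG_LOOKUP,
)
print(dict(zip(RTAGS, profile)))
```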
Algorithm for Generating User Profiles (2/2)
Algorithm 2. Generation of a Track-based Profile
Input: set of tracks of a user TRu, set of representative tags Tr
Output: number of the user's tracks for each representative musical genre, Tn
var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
var numTrack[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var RTrack = null
while ∃ next track t in TRu do
  RTrack = FindGenre(t)
  for each index i of RTags do
    if RTrack == RTags[i] then
      numTrack[i] = numTrack[i] + 1
endwhile

       rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
user1    65     176           5      4     0  168     0     3      0       0
user2   411       8          11    109     3    5     8     1      0       0
user3   157       7          11     10     6    2     1    39      4       2
user4   257      20           9     18     2    5     0     9      0       0
user5   110     277          15      8     6   85    10     3      2       7
Table 2. An example of track-based profiles
Preliminary Experimental Results (1/3)
• 1,000 user data set from Last.fm
– Users, tags, music items
• Standardization
– To remove extensive preference
• K-Means clustering algorithm
– Canopy Clustering
– 6 centroid points and 6 clusters
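The standardization step can be illustrated with a column-wise z-score. This is a minimal sketch: the slide does not state which standard deviation was used, so the population form (`pstdev`) is an assumption, and `standardize` is a hypothetical helper name. The clustering itself is not reproduced here.

```python
from statistics import mean, pstdev


def standardize(profiles):
    """Column-wise z-scores for a list of equal-length profile rows."""
    columns = list(zip(*profiles))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        # A constant column has zero spread; map it to 0.0.
        [(value - m) / s if s else 0.0 for value, (m, s) in zip(row, stats)]
        for row in profiles
    ]


# First three columns of the tag-based example profiles (Table 1).
rows = [[6, 2, 2], [5, 0, 0], [2, 2, 1], [10, 1, 0], [1, 4, 0]]
for row in standardize(rows):
    print([round(v, 3) for v in row])
```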
Preliminary Experimental Results (2/3)
              X1      X2      X3      X4      X5      X6      X7      X8      X9     X10
Cluster1   0.241   1.472   0.626   0.130   1.267   1.621   2.168   0.274   1.078   0.381
Cluster2   2.171   0.032   0.517   3.052   0.011  -0.030   0.328   1.533   1.245   0.162
Cluster3  -0.206  -0.273  -0.517  -0.178  -0.180  -0.294  -0.233  -0.171  -0.204  -0.136
Cluster4  -0.341   0.660  -0.459  -0.284  -0.208   1.178  -0.179  -0.321  -0.166   0.273
Cluster5  -0.074  -0.155   1.320  -0.230  -0.115  -0.261  -0.209  -0.070  -0.172  -0.071
Cluster6   2.815   7.640   5.168  -0.136   9.254   6.135   7.000   4.286   4.421   5.254
Table 3. Values of Centers of Tag-based Profiles

              X1      X2      X3      X4      X5      X6      X7      X8      X9     X10
Cluster1  -0.411   0.495   0.406  -0.338   1.565   0.131   1.632  -0.135   0.147   0.812
Cluster2   0.200  -0.444   0.007  -0.341   0.907  -0.468  -0.288   2.617   1.097   0.020
Cluster3  -0.897   1.651  -0.539  -0.442  -0.213   1.836   0.059  -0.507  -0.415   0.034
Cluster4   1.925  -0.590  -0.404   0.852  -0.264  -0.491   0.655  -0.002   2.850  -0.108
Cluster5   0.914  -0.557  -0.216   0.794  -0.296  -0.511  -0.297   0.014  -0.157  -0.147
Cluster6  -0.472  -0.327   0.380  -0.373  -0.184  -0.371  -0.241  -0.205  -0.300  -0.093
Table 4. Values of Centers of Track-based Profiles
• Clustering Validity
– Inter-cluster distances
– Distances between all pairs of centroids using cosine distance measure
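The inter-cluster distance measure can be sketched as below. With 6 centroids there are 6 * 5 / 2 = 15 pairs, which matches the N of 15 reported in the t-test table. The centroid values in this example are hypothetical, and the function names are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(a, b):
    """1 - cosine similarity; ranges from 0 (same direction) to 2."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def inter_cluster_distances(centroids):
    """Cosine distance for every unordered pair of centroids."""
    return [cosine_distance(a, b) for a, b in combinations(centroids, 2)]


# Six hypothetical centroids in three dimensions.
centroids = [
    [1.0, 0.2, -0.3], [-0.5, 1.1, 0.4], [0.3, -0.8, 0.9],
    [-1.2, -0.1, 0.2], [0.6, 0.7, -1.0], [0.1, -0.4, -0.6],
]
distances = inter_cluster_distances(centroids)
print(len(distances), sum(distances) / len(distances))
```

A larger mean distance indicates centroids that point in more distinct directions, which is the sense in which the two profile types are compared.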
Preliminary Experimental Results (3/3)
– T-test
• Mean of inter-cluster distances of tag-based profiles
• Mean of inter-cluster distances of track-based profiles
                       N    Mean    Std Dev   t      p-value
Tag-based profiles     15   0.8325  0.6834    2.55   0.0165
Track-based profiles   15   0.3785  0.0885
Table 5. T-test result for the means of inter-cluster distances
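As a worked check, the t statistic in Table 5 can be recomputed from the reported summary statistics. The slide does not say which t-test variant was used; this sketch uses the unpooled standard error, which gives the same statistic as the pooled form when both groups have the same size. The function name is an assumption.

```python
from math import sqrt


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic from summary statistics (unpooled standard error)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Values taken from Table 5.
t = two_sample_t(0.8325, 0.6834, 15, 0.3785, 0.0885, 15)
print(round(t, 2))  # 2.55
```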
Outline
• Motivation & Objectives
• Overview of the System
• Generation of User Profiles
• A Unified Music Recommendation
– UniEmotion Ontology
– Generation of User Profiles
– Music Recommendation Algorithm
• Performance Evaluation
• Related Work
• Conclusions and Future Work
UniEmotion Ontology (1/5)
[Plutchik’s model]
UniEmotion Ontology (2/5)
• Definition of the intensity of emotional tags
• SentiWordNet, http://sentiwordnet.isti.cnr.it/
– Example scores (P: positive, O: objective, N: negative)
• P: 0.625, O: 0.25, N: 0.125
• P: 0.375, O: 0.625, N: 0
• P: 1.0, O: 0, N: 0
UniEmotion Ontology (3/5)
• Intensity of emotional tags
– Strong
• Positive value >= 0.75 or Negative value >= 0.75
– Middle
• 0.25 <= Positive value < 0.75 or
• 0.25 <= Negative value < 0.75
– Weak
• Positive value < 0.25 and Negative value < 0.25
UniEmotion Ontology (4/5)
• Assigning the weights to the tags
– Factual tags: 1
– Positive tags
• Strong: 2.5
• Middle: 2
• Weak: 1.5
– Negative tags
• Strong: -2.5
• Middle: -2
• Weak: -1.5
• Final score of an item => sum of the weights
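The intensity thresholds and weights above can be combined into a small scoring routine. This is a sketch, not the UniEmotion implementation: the tag kind (factual, positive, negative) is assumed to be supplied by the ontology, a score of exactly 0.75 is treated as strong, and all function names are hypothetical.

```python
INTENSITY_WEIGHT = {"strong": 2.5, "middle": 2.0, "weak": 1.5}


def intensity(pos, neg):
    """Classify a tag by its SentiWordNet positive/negative scores."""
    strongest = max(pos, neg)
    if strongest >= 0.75:
        return "strong"
    if strongest >= 0.25:
        return "middle"
    return "weak"


def tag_weight(kind, pos=0.0, neg=0.0):
    """Weight of one tag; kind is 'factual', 'positive' or 'negative'."""
    if kind == "factual":
        return 1.0
    weight = INTENSITY_WEIGHT[intensity(pos, neg)]
    return weight if kind == "positive" else -weight


def item_score(tags):
    """Final score of an item: the sum of its tag weights."""
    return sum(tag_weight(kind, pos, neg) for kind, pos, neg in tags)


# One factual tag, one middle positive tag, one strong negative tag.
example = [("factual", 0.0, 0.0),
           ("positive", 0.625, 0.125),
           ("negative", 0.0, 0.8)]
print(item_score(example))  # 0.5
```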
UniEmotion Ontology (5/5)
• Two classes
– UniEmotion:Positive
• Emotional tags belonging to the positive emotional categories
• trust, surprise, anticipation, and happiness
– UniEmotion:Negative
• Emotional tags belonging to the negative emotional categories
• disgust, anger, fear, and sadness
• Two properties
– UniEmotion:Intensity
• Specifying the intensity of tags
– UniEmotion:Weight
• Specifying the weight of tags
Generation of User Profiles (1/2)
1. Listening habits-based User Profiles
– U1 = {u1, u2, …, um}, I1 = {i1, i2, …, in}
– <u, i, n>
• n: number of plays
2. Tag score-based User Profiles
– U2 = {u1, u2, …, um}, I2 = {i1, i2, …, in}
– <u, i, s>
• s: score of tags assigned by the UniEmotion ontology
3. Hybrid User Profiles
– U3 = {u1, u2, …, um}, I3 = I1 ∩ I2
– <u, i, m>
• m = α * n + (1 - α) * s; α = 0.5
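The hybrid profile can be sketched for a single user as a weighted combination over the items present in both profiles. The dictionaries and item names below are hypothetical, and `hybrid_profile` is an assumed name; the slide does not say whether play counts and tag scores are rescaled before being combined, so none is applied here.

```python
def hybrid_profile(plays, tag_scores, alpha=0.5):
    """Combine play counts n and tag scores s as alpha*n + (1-alpha)*s.

    Only items found in both profiles are kept (I3 = I1 ∩ I2).
    """
    shared = plays.keys() & tag_scores.keys()
    return {
        item: alpha * plays[item] + (1 - alpha) * tag_scores[item]
        for item in shared
    }


plays = {"artist_a": 10, "artist_b": 4}          # hypothetical play counts
tag_scores = {"artist_a": 2.0, "artist_c": 1.0}  # hypothetical tag scores
print(hybrid_profile(plays, tag_scores))  # {'artist_a': 6.0}
```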
Generation of User Profiles (2/2)
1. Listening habits-based user profiles
2. Tag score-based user profiles
3. Hybrid user profiles
Music Recommendation Algorithm (1/2)
• Finding Similar Users
– Pearson Correlation Similarity
• Calculating scores of items
– Considering the similar users’ ratings
• Recommending top n items
Music Recommendation Algorithm (2/2)
Input: a set of user profiles UP
Output: a set of recommended items RI
1. For all yi ∈ U
   Compute a similarity s between x and yi.
2. Sort by similarity
3. Select top n neighbors
4. […]
5. For all […]
   Compute a similarity t between x and […]
   For all […]
   preference += t * pref
6. Rank by preference
7. Select top n items
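A user-based collaborative filtering loop in the spirit of these steps can be sketched in plain Python. This is an illustration only, not the Apache Mahout code used in the system: the profile data is hypothetical, neighbors with non-positive similarity are dropped as a simplifying assumption, and candidate items are assumed to be those the target user has no preference for yet.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items both users have a value for."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def recommend(profiles, target, n_neighbors=2, n_items=2):
    """Recommend items the target user has not yet interacted with."""
    x = profiles[target]
    # Steps 1-3: similarity to every other user, sorted, top n kept.
    sims = [(pearson(x, prefs), user)
            for user, prefs in profiles.items() if user != target]
    neighbors = [(s, u) for s, u in sorted(sims, reverse=True)[:n_neighbors]
                 if s > 0]
    # Step 5: accumulate similarity-weighted preferences.
    scores = {}
    for similarity, user in neighbors:
        for item, pref in profiles[user].items():
            if item not in x:
                scores[item] = scores.get(item, 0.0) + similarity * pref
    # Steps 6-7: rank by preference and keep the top items.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n_items]


profiles = {
    "x": {"a": 5, "b": 3, "c": 1},
    "y1": {"a": 4, "b": 3, "c": 1, "d": 5},
    "y2": {"a": 1, "b": 3, "c": 5, "e": 4},
    "y3": {"a": 5, "b": 2, "c": 1, "d": 1, "f": 2},
}
print(recommend(profiles, "x"))  # ['d', 'f']
```

Any of the three profile types (play counts, tag scores, or hybrid values) could be passed in as the preference values.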
Performance Evaluation
• Implementation Environment: Apache Web Server
– User database : MySQL 5.0
– Listening habits collector, tag score generator: PHP
– Recommendation Engine: Apache Mahout
– UniTag and UniEmotion Ontology: JDK 6.0
• Experimental Data
– Information on 1,000 users from last.fm [http://mir.dcs.gla.ac.uk/]
– Containing 18,700 artists and 12,600 tags
– 70% training data, 30% test data
Performance Evaluation
• Evaluation Model
– Recommended items
• Items which users are interested in (True Positive, TP)
• Items which users are not interested in (False Positive, FP)
– Items which are not recommended
• Items which users are interested in (False Negative, FN)
• Items which users are not interested in (True Negative, TN)
– Precision P = TP / (TP + FP)
• # of correct recommendations / # of all recommended items
– Recall R = TP / (TP + FN)
• # of correct recommendations / # of preferred items
– F-measure F = 2 * P * R / (P + R)
• Harmonic mean of precision and recall
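The three measures can be computed from a recommendation list and a held-out set of preferred items. The sketch below uses hypothetical item names and an assumed function name; returning 0.0 for empty denominators is a convention chosen here, not something stated on the slide.

```python
def evaluate(recommended, relevant):
    """Return (precision, recall, f_measure) for one user."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)   # recommended and preferred
    fp = len(recommended - relevant)   # recommended but not preferred
    fn = len(relevant - recommended)   # preferred but not recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


p, r, f = evaluate(["a", "b", "c", "d"], ["a", "b", "e"])
print(p, r, f)
```

Here two of the four recommended items are preferred, so precision is 2/4, recall is 2/3, and the F-measure is 4/7.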
Experimental Results (1/3)
• Precision
[Figure panels: by number of similar users; by number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Experimental Results (2/3)
• Recall
[Figure panels: by number of similar users; by number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Experimental Results (3/3)
• F-measure
[Figure panels: by number of similar users; by number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Statistical Validation
• One-way ANOVA on three groups
– Method 1: listening habits-based approach
– Method 2: tag-based approach
– Method 3: hybrid approach
• Tukey Multiple Comparison Test
– Asymmetric distributions
• Log transformation
– Different letters indicate a significant difference between two groups
Method                    1               2               3               F
Mean of log(prec)         -3.962 B        -4.036 B        -2.879 A        34.27***
Mean precision (SD)       0.020 (0.006)   0.020 (0.009)   0.068 (0.040)
N                         24              24              24
<Table 1. Test for precision> ***: p < 0.001

Method                    1               2               3               F
Mean of log(recall)       -3.285 B        -4.099 C        -2.635 A        26.80***
Mean recall (SD)          0.044 (0.023)   0.019 (0.010)   0.093 (0.056)
N                         24              24              24
<Table 2. Test for recall> ***: p < 0.001

Method                    1               2               3               F
Mean of log(F-measure)    -3.748 B        -4.117 C        -2.894 A        41.31***
Mean F-measure (SD)       0.024 (0.006)   0.018 (0.008)   0.06 (0.034)
N                         24              24              24
<Table 3. Test for F-measure> ***: p < 0.001
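The F statistic of a one-way ANOVA on log-transformed scores can be sketched as follows. The per-run precision values are hypothetical (the slides report only means and standard deviations), the function name is an assumption, and the Tukey comparison is not reproduced here.

```python
from math import log


def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical precision values for the three methods.
raw = [[0.020, 0.030, 0.015], [0.020, 0.010, 0.030], [0.070, 0.050, 0.100]]
# Log transformation to reduce the asymmetry of the distributions.
logged = [[log(v) for v in group] for group in raw]
f_stat = one_way_anova_f(logged)
print(f_stat)
```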
Related Work
• MusicBox
– A personalized music recommender system based on social tags
– Third-order tensor model
– The method improves the recommendation quality
• Foafing the Music
– Collecting music information in a semantic web environment
– User information, music information, concert information
– Recommendation of similar music items
• OntoEmotions
– An ontology of emotional categories covering the basic emotions
– Armeteo art portal
– New relations can be inferred by reasoning on the ontology of emotions
Conclusions
• Solution to Cold Start Problem
– It takes time to collect users’ listening habits
– Adding tags is easy
– Tags act like word-of-mouth
• Performance Enhancement
– Precision, Recall, F-measure
– Hybrid approach > listening habits-based approach, tag-based approach
Future Work
• Elaborating UniEmotion Ontology
– Emerging Internet slang
• Item Selection
– Product network analysis considering tags
– Analyzing short descriptions
