All data from The Metal Archives. Web page generated with R.
This page is about the so-called top-rated, or highest-reviewed, albums according to MA users. They are really way too famous (and too mainstream for my taste) to need a list of their own. But anyway.
Trying to find the most favored album, I scraped 50,000+ review pages across the site and analyzed them with statistical methods. The result? Well, that depends on how you evaluate the data. Models with different weights may lead to entirely different conclusions. The extremely small number of reviews on any individual album and the heavily skewed distribution of the ratings in general also limit what statistical techniques can tell us. So I'd rather not offer ONE definite album name. But at least it's not Hades Archer's Penis Metal.
The work comes in three parts:
- Fetching the data from the review pages, building a dataset and applying techniques for comparing groups (principally pairwise Mann-Whitney U tests) to rank the reviewed albums and make a top list out of them (see the list at the bottom of the page).
- Word mining on the top-list dataset, which is a subset of all the data I collected, to see how exactly the most popular albums have been reviewed.
- Sampling the top-list dataset further in order to visualize it.
But what I write here will not follow this order. The details of the analysis I used to select and rank the albums will (possibly) be added later, if I can somehow motivate myself to write about them.
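Until those details are written up, here is a minimal sketch of what a pairwise Mann-Whitney U comparison can look like in base R. The album labels and ratings are made-up toy data, and the Holm correction and the median-based ordering are assumptions for illustration, not necessarily what was used for the list on this page.

```r
# Toy ratings for three hypothetical albums (not real MA data).
set.seed(1)
ratings <- data.frame(
  album  = rep(c("A", "B", "C"), times = c(12, 15, 10)),
  rating = c(sample(85:100, 12, replace = TRUE),
             sample(60:95,  15, replace = TRUE),
             sample(40:90,  10, replace = TRUE)),
  stringsAsFactors = FALSE
)

# Pairwise Mann-Whitney U (Wilcoxon rank-sum) tests between all album pairs.
# p.adjust.method = "holm" corrects for multiple comparisons (an assumption).
# exact = FALSE uses the normal approximation, since 0-100 ratings have many ties.
pw <- pairwise.wilcox.test(ratings$rating, ratings$album,
                           p.adjust.method = "holm", exact = FALSE)
print(pw$p.value)

# One simple way to derive an ordering: sort albums by their median rating.
med <- sort(tapply(ratings$rating, ratings$album, median), decreasing = TRUE)
print(med)
```

The p-value matrix only says which pairs differ detectably; with as few reviews per album as MA has, many pairs will not.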
Start with a glance
The bubble chart shows how a random sample of reviewers (sample size = 2000) rated the albums over time. The bubble sizes indicate the number of words in an individual review, while the colors mark the bands’ region of origin (Europe: red, America: black, Oceania: beige, Asia: lilac, listed in descending order of count).
It seems the users really like to heap their reviews at the top. Even though I tried hard to adjust the parameters, the bubbles still overlap heavily. Understandable though. After all, we are probably more motivated to write a review of an album we like than of stuff we find generic.
Look into the reviews
Wanna know what they say exactly? Let’s do some simple word mining.
How many words are there in the dataset?
## [1] 1103297
That’s quite a few.
But how many unique words?
## [1] 20599
About 98.13% of all words are repeats (1 - 20599/1103297 ≈ 0.9813).
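For the curious, counting total and unique words can be sketched in base R as below. The two review snippets are invented, and the crude tokenizer (lowercase, split on anything that is not a letter or apostrophe) is an assumption; the page may have used a text-mining package instead.

```r
# Two invented review snippets standing in for the scraped texts.
reviews <- c("Best thrash album ever, the riffs are perfect.",
             "The vocals are evil and the riffs are heavy.")

# Crude tokenizer: lowercase, then split on runs of non-letter characters.
tokens <- unlist(strsplit(tolower(reviews), "[^a-z']+"))
tokens <- tokens[nzchar(tokens)]  # drop empty strings, if any

n_total  <- length(tokens)          # all word occurrences
n_unique <- length(unique(tokens))  # distinct words

# Share of occurrences that repeat an already-seen word.
repeat_share <- 1 - n_unique / n_total
print(c(total = n_total, unique = n_unique))
print(repeat_share)

# The same formula applied to the counts reported on this page.
page_share <- 1 - 20599 / 1103297
print(round(page_share, 4))
```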
How many are left if you exclude stuff like “a”, “the”, “and”, blabla?
That means filtering out prepositions, conjunctions, pronouns and articles.
## [1] 10161
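A minimal sketch of that filtering step, assuming a tiny hand-made stop-word list (the list actually used for this page is not shown and was certainly longer):

```r
# Toy tokens from one invented sentence.
tokens <- c("the", "riffs", "on", "this", "album", "are",
            "heavy", "and", "they", "rule")

# Hypothetical mini stop-word list: articles, conjunctions,
# prepositions and pronouns only.
stop_words <- c("a", "an", "the", "and", "or", "but",
                "of", "in", "on", "with", "it", "they", "this")

# Keep only tokens that are not in the stop-word list.
content_words <- tokens[!tokens %in% stop_words]
print(content_words)
print(length(unique(content_words)))
```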
What are the words they used frequently?
## # A tibble: 10,159 x 2
## . freq
## <chr> <int>
## 1 metal 6796
## 2 like 6534
## 3 album 5342
## 4 more 4522
## 5 all 4126
## 6 sound 3765
## 7 well 3549
## 8 first 3048
## 9 best 2853
## 10 very 2830
## # ... with 10,149 more rows
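A frequency table like the one above can be built with `table()` and `sort()`. This sketch uses a handful of toy tokens and base R only, so the printout is a plain data frame rather than a tibble.

```r
# Toy tokens standing in for the filtered review words.
tokens <- c("metal", "like", "album", "metal", "sound",
            "metal", "album", "best")

# Count each word and sort from most to least frequent.
freq <- sort(table(tokens), decreasing = TRUE)

# Convert to a two-column data frame (word, freq).
freq_df <- data.frame(word = names(freq),
                      freq = as.integer(freq),
                      stringsAsFactors = FALSE)
print(head(freq_df, 3))
```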
Or even try filtering harder?
## # A tibble: 3,000 x 2
## . freq
## <chr> <int>
## 1 best 2853
## 2 heavy 1498
## 3 hard 894
## 4 perfect 738
## 5 thrash 663
## 6 dark 652
## 7 riff 697
## 8 vocal 630
## 9 classic 616
## 10 evil 480
## # ... with 2,990 more rows
Read some random words?
## [1] "vodka" "principal" "clumsy" "junkie"
## [5] "grass" "dramatically" "vitriolic" "delivery"
## [9] "arsenal" "traffic" "pantheistic" "genuine"
## [13] "roman" "patrick" "divergence" "alma"
## [17] "undiscoverable" "stud" "endeavor" "peter"
## [21] "silver" "kirk" "ensemble" "afterwards"
## [25] "dual" "unexplainably" "fission" "thin"
## [29] "transom" "gem" "pocket" "fantasy"
## [33] "cryptic" "defiantly" "deliberate" "violin"
## [37] "advised" "infancy" "compassion" "observant"
## [41] "anthony" "suspect" "eccentricity" "pusillanimous"
## [45] "usurpation" "dialogue" "scene" "commonly"
## [49] "delicacy" "shortcoming" "dogs" "replay"
## [53] "som" "female" "scary" "debut"
## [57] "liege" "palisade" "fittingly" "clean"
## [61] "isolation" "frigging" "friday" "light"
## [65] "queue" "ira" "elvis" "headless"
## [69] "rapidly" "maiden" "inflated" "bloodline"
## [73] "trusting" "integration" "nevermore" "craft"
## [77] "headed" "vulnerable" "inheritance" "catholicism"
## [81] "tony" "analog" "anguish" "painted"
## [85] "outlier" "funk" "immaculate" "stray"
## [89] "dennis" "bloodshed" "computer" "exceedingly"
## [93] "rocky" "averse" "overabundance" "hyperactive"
## [97] "climax" "continually" "orbit" "closer"
## [101] "unsurprising" "celebrated" "microphone" "wrest"
## [105] "unorthodox" "willing" "iced" "mandarin"
## [109] "clip" "battlefield" "reward" "rome"
## [113] "amon" "global" "eloquent" "frankly"
## [117] "analogy" "rex" "chillingly" "inescapably"
Kirk? Grass? Vodka? Good words, my fellow devil worshipers.
Show them in a picture
The frequency is shown when you hover over a word.
Hmmm… not especially surprising.
(but if you look carefully into the tiny ones, you might find something interesting...)
Something else about the reviews
What’s the earliest date in the dataset?
## [1] "2002-07-14"
And what’s the latest?
## [1] "2021-06-27"
How many words does a review contain on average?
## [1] 3735.419
How many words are there in the shortest and the longest review respectively?
## [[1]]
## [1] 735
##
## [[2]]
## [1] 31195
I think I’d check out that 31195-word-long review later.
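These summaries boil down to `range()`, `mean()`, `min()` and `max()`. A minimal sketch on three invented reviews, where the column names `date` and `text` are hypothetical and words are counted by splitting on whitespace:

```r
# Three invented reviews; column names are assumptions for illustration.
reviews <- data.frame(
  date = as.Date(c("2002-07-14", "2010-03-01", "2021-06-27")),
  text = c("short review here",
           "a somewhat longer review of this album",
           "great"),
  stringsAsFactors = FALSE
)

# Earliest and latest review date.
date_range <- range(reviews$date)
print(date_range)

# Words per review: split each text on runs of whitespace and count the pieces.
n_words <- lengths(strsplit(trimws(reviews$text), "\\s+"))

print(mean(n_words))                     # average length
print(list(min(n_words), max(n_words)))  # shortest and longest
```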
Some other computations
Who are the scene darlings?
Let’s see who has the most albums among, say, the top 500.
## # A tibble: 486 x 2
## Band Album_n
## <chr> <int>
## 1 Black Sabbath 10
## 2 Bathory 9
## 3 Blind Guardian 9
## 4 Iron Maiden 8
## 5 Judas Priest 8
## 6 Amorphis 7
## 7 Death 7
## 8 Helloween 7
## 9 Moonsorrow 7
## 10 Neurosis 7
## # ... with 476 more rows
So Black Sabbath top the list with 10 albums. And Bathory, Blind Guardian…
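Counting albums per band is a one-liner with `table()`. A sketch on a toy top list (six rows picked for illustration, not the real top 500):

```r
# Toy top list; which albums appear here is an assumption for illustration.
top <- data.frame(
  Band  = c("Black Sabbath", "Bathory", "Black Sabbath",
            "Death", "Bathory", "Black Sabbath"),
  Album = c("Paranoid", "Hammerheart", "Master of Reality",
            "Symbolic", "Blood Fire Death", "Black Sabbath"),
  stringsAsFactors = FALSE
)

# Count rows per band and name the count column Album_n.
counts <- as.data.frame(table(Band = top$Band),
                        responseName = "Album_n",
                        stringsAsFactors = FALSE)

# Most albums first; ties broken alphabetically.
counts <- counts[order(-counts$Album_n, counts$Band), ]
print(counts)
```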
What are the most beloved genres then?
## # A tibble: 291 x 2
## Genre Album_n
## <chr> <int>
## 1 Death Metal 52
## 2 Black Metal 37
## 3 Heavy Metal 32
## 4 Thrash Metal 32
## 5 Melodic Death Metal 18
## 6 Power/Speed Metal 15
## 7 Heavy/Doom Metal 13
## 8 Doom Metal 11
## 9 Progressive Metal 11
## 10 Power Metal 10
## # ... with 281 more rows
Death metal and black metal turn out to be our favorites. Since I keep the "genre" field exactly as it is, you might notice that "Death Metal" and "Melodic Death Metal" are counted as two genres here. It's safe to say the numbers will rise if I merge subgenres prefixed with “Brutal”, “Post-”, “Atmospheric” and this or that or whatever, not to mention that quite a few bands have a genre like "Trve Black Metal (early); Folk/Dark Basement Ambient (mid); Extreme Experimental Punk/Disco (later)".
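If one did want to merge subgenres, a rough sketch could map every label to the first "parent" genre it contains. The parent list and the first-match rule are assumptions made up for this example; nothing of the sort was applied to the table above.

```r
# Toy genre labels as they might appear on MA.
genre <- c("Death Metal", "Melodic Death Metal", "Brutal Death Metal",
           "Black Metal", "Atmospheric Black Metal", "Heavy/Doom Metal")

# Hypothetical parent genres, checked in this order.
parents <- c("Death Metal", "Black Metal", "Doom Metal", "Heavy Metal")

# Return the first parent whose name occurs in the label, else the label itself.
collapse_genre <- function(g) {
  hit <- parents[vapply(parents, function(p) grepl(p, g, fixed = TRUE),
                        logical(1))]
  if (length(hit) > 0) hit[1] else g
}

merged <- vapply(genre, collapse_genre, character(1), USE.NAMES = FALSE)
counts <- sort(table(merged), decreasing = TRUE)
print(counts)
```

The first-match rule is crude: a hybrid label such as "Blackened Death Metal" would land in "Death Metal" only, and multi-period genre strings would need to be split first.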
Which countries gave us the most albums we like?
## # A tibble: 37 x 2
## Country per(%)
## <chr> <int>
## 1 United States 34.4
## 2 Sweden 10.6
## 3 United Kingdom 8.0
## 4 Norway 7.4
## 5 Germany 6.4
## 6 Finland 4.8
## 7 Canada 4.2
## 8 France 2.6
## 9 Australia 2.0
## 10 Switzerland 1.8
## # ... with 27 more rows
??? 34.4% Who would have thought!
What are the most “controversial” albums?
To examine how opinion diverges on a certain album, you can look at the sd value (standard deviation) of the ratings. “In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values,” as Wikipedia puts it. “A low sd indicates that the values tend to be close to the expected value of the set, while a high sd indicates that the values are spread out over a wider range” bla bla… Based on this concept, a high sd is probably a sign of dissension among the reviewers, even though the album they reviewed may have a relatively high rating on average.
## # A tibble: 263 x 6
##    Band          Album               Reviewer Rating.mean Rating.sd Rating.range
##    <chr>         <chr>                  <int>       <dbl>     <dbl> <chr>
##  1 Velvet Cacoon Genevieve                 17        81.9      29.7 0~100
##  2 Death         The Sound of Perse~       29        77.2      29.4 3~100
##  3 Agalloch      Marrow of the Spir~       17        80        29.2 17~100
##  4 Gorguts       Obscura                   24        81        27.8 0~100
##  5 Ihsahn        The Adversary             15        83.3      27.4 0~100
##  6 Electric Wiz~ Dopethrone                17        81        27.2 0~100
##  7 Metallica     Master of Puppets         38        80.3      27.0 0~100
##  8 Wormed        Planisphærium             16        81.3      26.8 6~100
##  9 Celtic Frost  Monotheist                23        85.6      26.0 5~100
## 10 Opeth         Damnation                 17        83.5      25.7 19~100
## # ... with 253 more rows
Rather surprising that Death’s The Sound of Perseverance pops up. The album has 29 reviewers whose ratings ranged from 3 to 100, with 29.418 being the sd of the rating values. Opinion really seems divided here.
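A per-album summary of this kind can be sketched in base R as follows. The two albums and their ratings are invented; the column names mirror the table above. Note that R's `sd()` is the sample standard deviation (denominator n - 1).

```r
# Invented ratings for two hypothetical albums.
reviews <- data.frame(
  album  = rep(c("X", "Y"), each = 4),
  rating = c(0, 90, 95, 100,    # one dissenter drags X apart
             94, 96, 98, 100),  # Y is tightly clustered
  stringsAsFactors = FALSE
)

# Summarize each album: reviewer count, mean, sd and min~max range.
summarize_album <- function(r) {
  data.frame(Reviewer     = length(r),
             Rating.mean  = mean(r),
             Rating.sd    = sd(r),
             Rating.range = paste(min(r), max(r), sep = "~"),
             stringsAsFactors = FALSE)
}
stats <- do.call(rbind, lapply(split(reviews$rating, reviews$album),
                               summarize_album))
stats$Album <- rownames(stats)

# Most "controversial" first: sort by sd, descending.
stats <- stats[order(-stats$Rating.sd), ]
print(stats)
```

Sorting the same table by ascending sd, or by descending mean, gives the opposite end of the spectrum.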
What are the most “universally acclaimed” albums?
Same concept, measured the other way round.
## # A tibble: 263 x 6
##    Band        Album                 Reviewer Rating.mean Rating.sd Rating.range
##    <chr>       <chr>                    <int>       <dbl>     <dbl> <chr>
##  1 Entombed    Left Hand Path              23        98.7      2.05 94~100
##  2 King Diamo~ Abigail                     18        97.2      4.22 84~100
##  3 Symphony X  The Divine Wings of ~       17        96.5      4.68 83~100
##  4 W.A.S.P.    The Crimson Idol            16        96.4      6.23 75~100
##  5 Terrorizer  World Downfall              15        96.3      7.22 73~100
##  6 Rush        Moving Pictures             15        96.2      3.90 87~100
##  7 Bathory     Under the Sign of th~       18        95.7      5.65 80~100
##  8 Katatonia   Dance of December So~       22        95.5      7.15 72~100
##  9 Type O Neg~ October Rust                15        95.3      7.23 78~100
## 10 Atheist     Unquestionable Prese~       20        95.2      6.06 80~100
## # ... with 253 more rows
The 23 reviewers of Entombed’s Left Hand Path were practically unanimous in their approval of the album, with ratings between 94 and 100 and a mean of 98.7. I’ve actually added the album to my to-revisit list. (Edit: why did they like it that much? I don't get it.)
Where are my personal favourite albums?
## [1] They are too obscure to be shown here.
OK I faked. But only a couple of them and they don't rank high.
So top 500. Enjoy.
Updated: someone criticized me because the list didn't (and still doesn't) include blabla band's blabla album. I think I've already explained that this list was generated from MA users' reviews? The results are perhaps biased, but they don't represent my own taste. Can't you see my favorite albums ain't on the list either? And can't you see how scarce the reviews are on the site? Instead of blaming it on my "useless statistics", I think it's better to log in to MA and write as many reviews as possible in all your free time. 🤷
Anyway, you can still use the list to check out old/famous stuff you'd possibly like but somehow missed.
Other suggestions, such as categorizing the list by time and region, are under consideration and might be added later.
Updated[2]: filters added.
Updated[3]:
New approach to ranking the albums, adjusted for the cohort effect (time period), which gives relatively more weight to later albums that have high ratings but comparatively fewer reviews.
The earlier, unadjusted lists were biased in favor of albums with more reviews, which makes less sense when comparing albums widely separated in release date, since reviews naturally accumulate over time.
The new algorithm attempts to compensate for this structural influence, but to be honest the issue is all too complicated, given that many factors such as the growth of the MA user base, the increase in marketing hype by some labels, or the declining interest in irrelevant (old) releases might also affect the number of reviewers sharing their opinions.
Which way will be better? Who knows. Perhaps I should not be so serious because MA reviews are criminally arbitrary......
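To make the idea of a cohort adjustment concrete, here is one possible sketch. It is NOT the algorithm behind the list: the cohort split at the year 2000, the scaling of review counts by the cohort median, the prior weight `m` and the shrinkage formula are all assumptions invented for this example, as are the four albums.

```r
# Four invented albums: two old ones with many reviews, two new ones with few.
albums <- data.frame(
  album       = c("old_1", "old_2", "new_1", "new_2"),
  year        = c(1986, 1988, 2016, 2018),
  n_reviews   = c(30, 20, 6, 4),
  mean_rating = c(92, 90, 95, 88),
  stringsAsFactors = FALSE
)

# Hypothetical cohorts: released before 2000 vs. 2000 and later.
albums$cohort <- ifelse(albums$year < 2000, "pre-2000", "2000+")

# Review count relative to the median count within the same cohort,
# so that 6 reviews on a recent album weigh like 30 on an old one here.
cohort_median <- ave(albums$n_reviews, albums$cohort, FUN = median)
albums$rel_n  <- albums$n_reviews / cohort_median

# Shrink each mean toward the global mean; m is a hypothetical prior weight.
m <- 1
global_mean  <- mean(albums$mean_rating)
albums$score <- (albums$rel_n * albums$mean_rating + m * global_mean) /
  (albums$rel_n + m)

ranked <- albums[order(-albums$score), c("album", "cohort", "rel_n", "score")]
print(ranked)
```

With raw review counts as weights, the old albums would dominate purely because they have had decades to collect reviews; with cohort-relative counts, a recent album is compared against its contemporaries instead.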
Also some other small changes that are not worth explaining.
Click here for the list dedicated to black metal albums.
Rating Min and Max: the lowest rating and the highest rating.
Rating Mean: the average of all the ratings.
Rating Sd: the standard deviation of the ratings, a measure of how variable the ratings are.