Pages: [1]
Tom_unoduetre
    Donator
Tester
BAM!ID: 1183
Joined: 2006-05-31
Posts: 57
Credits: 9,948,549
World-rank: 43,560

2012-05-23 08:47:23

Hi Willy,

I think after the BOINCstats update I'm no longer able to give points to projects for the project popularity ranking, or am I just blind?

And can you hide the retired projects from that list? I obviously cannot speak for all users here, but I personally don't need to know that retired project X was ranked 55 at some point in time.

Many thanks
[BOINCstats] Willy
 
Forum moderator - Administrator - Developer - Tester - Translator
BAM!ID: 1
Joined: 2006-01-09
Posts: 9179
Credits: 349,821,048
World-rank: 3,274

2012-05-23 09:02:46

The ranking is now based on the number of hosts attached to a project and the resource shares assigned. This should show how popular a project is; unpopular projects won't have many attached hosts.

It may need some work though.
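Roughly, the calculation amounts to summing each attached host's share for the project. A simplified Python sketch (data layout and field names invented for illustration, not the actual BOINCstats implementation):

```python
# Hypothetical sketch of a host-based popularity score: each BAM!-attached
# host contributes the fraction of its resources assigned to the project,
# and every host counts equally regardless of computing power.
# Field names are made up; this is not BOINCstats' actual code.

def popularity(hosts, project):
    """Sum the resource-share fraction each attached host gives `project`."""
    score = 0.0
    for host in hosts:
        shares = host["resource_shares"]          # e.g. {"SETI@home": 100, ...}
        total = sum(shares.values())
        if project in shares and total > 0:
            score += shares[project] / total      # fraction of this host's resources
    return score

hosts = [
    {"resource_shares": {"SETI@home": 100, "Einstein@Home": 100}},
    {"resource_shares": {"SETI@home": 100}},
]
print(popularity(hosts, "SETI@home"))  # 0.5 + 1.0 = 1.5
```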
Please do not PM, IM or email me for support (they will go unread/ignored). Use the forum for support.
Tom_unoduetre
    Donator
Tester
BAM!ID: 1183
Joined: 2006-05-31
Posts: 57
Credits: 9,948,549
World-rank: 43,560

2012-05-23 10:25:56

Thanks for the explanation Willy.

So the ranking in the project popularity should (more or less) be comparable to the overall project stats page?

If I take a project like RNA World I see a big difference: in the popularity ranking it sits quite low at 76, whereas it should rank somewhere in the middle of the field according to the number of attached hosts.

This is really not a big issue (at least for me), but maybe important for stats lovers.
[BOINCstats] Willy
 
Forum moderator - Administrator - Developer - Tester - Translator
BAM!ID: 1
Joined: 2006-01-09
Posts: 9179
Credits: 349,821,048
World-rank: 3,274

2012-05-23 10:52:38

To be clear: only hosts which are using BAM! are used for calculating popularity! Not ALL the hosts you see in the host stats.
Please do not PM, IM or email me for support (they will go unread/ignored). Use the forum for support.
Nuadormrac
BAM!ID: 75286
Joined: 2009-09-15
Posts: 56
Credits: 15,389,584,632
World-rank: 225

2012-08-06 07:41:43
last modified: 2012-08-06 07:59:00

[BOINCstats] Willy wrote:

The ranking is now based on the number of hosts attached to a project and the resource shares assigned. This should show how popular a project is; unpopular projects won't have many attached hosts.

It may need some work though.


This raises an interesting question, because if I'm understanding correctly, there's a technical difference between how this rates things and actual production. I take it that essentially we're looking at something along the lines of

Popularity == number_hosts * resource_share_per_host

though the actual calculations might be a bit more complex than this. The reason I mention this is that hosts aren't all equal; they vary in the number of CPUs and/or GPUs present, and also in their potential output. For instance, my old laptop has

- core 2 duo, 1.7 GHz
- 2 GB RAM
- Intel 965 graphics (aka nothing useful for GPU crunching)

However my new laptop has

- i7 quad core, 2.3 GHz, gen 3 (aka Ivy Bridge)
- 8 GB RAM
- nVidia GT 650M, 1 GB GDDR5 video RAM, a Kepler part, hence capable of GPU crunching

Obviously it's able to do a lot more, not only because of having 8 CPU cores as seen by Windows and BOINC (the CPU supports hyper-threading), running at a higher clock, but also due to a vastly improved instruction set, going from the Core 2 up to a third-generation Core i7 (Ivy Bridge). And on the GPU side, the new machine has a usable GPU where the old one has none; the hosts simply aren't equal. But a formula along the lines of the one above would treat these two hosts as equal, so that putting two projects on the old computer and two on the new one would yield the same popularity contribution from each. Their output, and the credits they generate, would not be the same.

So, is any weight given to a host's potential output? Suppose someone had a main cruncher plus a bunch of old Athlon 64 single-core CPUs, Pentium IIs, and Pentium IIIs sitting around, installed BOINC on them, and then set 10 Pentium IIs to crunch one project while putting their new computer on another. Would this actually make the project on the Pentium II farm more popular? What if, in a given day, all the WUs completed by the Pentium II farm still amounted to less than the output of the main machine, so the slew of Pentium IIs contributed an extremely small portion of the user's total RAC? Should the project running on a bunch of outdated and exceedingly slow old hosts be counted as more popular? In a way, yes, it runs on more machines; but where a single person owns all those machines, if they wanted to contribute more to a project, wouldn't they arguably put their better machine on it?

As a counter-argument, one might ask: if two people, each with one computer, assign their resource shares, does the person with an older, slower computer make a project less popular? But where people have more than one computer on a single account, the actual credit breakdown with respect to how they assigned things across their (unequal) hosts doesn't seem completely irrelevant to which projects they care to crunch more for...
noderaser
 
BAM!ID: 13859
Joined: 2006-12-03
Posts: 827
Credits: 170,468,995
World-rank: 5,477

2012-08-07 06:22:17

I think the idea is to rate a project based on how much people like it, assuming that well-liked (popular) projects will have more hosts attached and that people will allocate a higher resource share to those projects. This is independent of how much computing power a project has; there are other stats for that.
Nuadormrac
BAM!ID: 75286
Joined: 2009-09-15
Posts: 56
Credits: 15,389,584,632
World-rank: 225

2012-08-07 18:23:04
last modified: 2012-08-07 18:51:54

But on accounts where an individual cruncher has more than one computer, devoting their less powerful computer to a project doesn't mean it is more popular with them. In essence, if one looks at the resource share across all computers on a single user ID, the effective resource share for that user would be lower.

Think, for instance, about the situation where my credit return per day was not even 1/10th of what it is now. If one looks at my credits as shown on this site, the striking difference is obvious. Last month I got a new computer. Before it, I couldn't have hoped to get anywhere close to 140,000-150,000 credits per day, and given our PotM isn't the highest paying project, it could go up a bit more (POEM is pretty good, also as PotM for GPU, and that does seem to be among the top payers now). On the old host I could not have hoped to get even 14,000-15,000 a day; hell, I couldn't get half that. To give an idea (though this wasn't my first computer upgrade), the credits I had a month ago totaled only 980,000, cumulative since 2004. The entire credit difference is the new computer being added alongside the old, and it is just one more computer, not several... That's the difference in resource contribution I can decide upon when choosing to add something to one host but not the other...

OK, I won't keep it this way, but for now I'm running both computers, and I don't have the same projects on each. The new computer can put out 10, perhaps 20 or 25x more work per day. It can crunch 8 tasks at once (a hyper-threaded quad core), not 2 (dual core), and it has the GPU.

So, if I set the resource share differently on each box (which I do, because I don't attach them to all the same projects), then 50% of the time on the new computer is not devoting the same resources as 50% on the old host, because the new host has an order of magnitude more resources than the old: an i7 quad 2.3 GHz 3610QM vs a Core 2 Duo 1.7 GHz T2370 (the latter figure in each case being the CPU model number). Wouldn't it stand to reason that if I devote a given percentage resource share on the new computer rather than the old, that share represents a higher contribution, being a slice of a bigger pool of resources?

What I'm proposing is that instead of looking at resource share across all hosts and multiplying it by the number of hosts, one pools all the computers under each user account together to get a per-user resource share (which basically combines each host owned by the user), then multiplies that value by the number of users. This way, when unequal hosts are combined under a single user account, one can compare how many resources a given user is devoting to each project. An i7 3610QM != a Core 2 Duo T2370, and when deciding which projects to run on which computer, I'm well aware of that. Actually, of late I've partly been putting a higher paying project on the Core 2 Duo, because its computational capability is arguably rubbish next to that of my new computer (which I'm more likely to run on stuff I care about beyond credit return alone). In this sense, a resource share per host is not necessarily equal to a resource share PER USER...

I guess it depends on whether one wants to see how popular each project is among the machines themselves, or how popular it is with the people who set up and run those computers. An aggregated resource share per person (when people can maintain, and unequally assign resources from, hosts that themselves have unequal resources) would differ from the per-host allocation, depending on whether we aggregate all hosts for each user and calculate a per-user resource share, or stick with the per-host resource share. For some this would be easy, as they have only one host; but for others with a farm (of different-generation hosts, each with different total resources to be divvied up), it would be a bit more complicated to aggregate.
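The pooling idea above could be sketched like this in Python (data layout invented for illustration; this is just one possible way to derive a per-user share, not anything BOINCstats actually computes):

```python
# Hypothetical sketch of the proposed per-user aggregation: pool every
# host under one user account, derive a single per-user resource-share
# fraction per project, then sum over users rather than over hosts.
# Here each host counts equally within an account, since raw host
# capacity isn't available in this simplified model.

def per_user_share(user_hosts, project):
    """Average of the project's share fraction across one user's hosts."""
    fractions = []
    for shares in user_hosts:                    # one dict of shares per host
        total = sum(shares.values())
        fractions.append(shares.get(project, 0) / total if total else 0.0)
    return sum(fractions) / len(fractions)

def popularity_by_users(users, project):
    """Sum each user's pooled share for the project (one vote per person)."""
    return sum(per_user_share(hosts, project) for hosts in users.values())

users = {
    "alice": [{"P": 100}, {"P": 50, "Q": 50}],   # two hosts, unequal shares
    "bob":   [{"Q": 100}],
}
print(popularity_by_users(users, "P"))  # alice: (1.0 + 0.5)/2 = 0.75
```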
noderaser
 
BAM!ID: 13859
Joined: 2006-12-03
Posts: 827
Credits: 170,468,995
World-rank: 5,477

2012-08-08 06:22:22

At present, the host's computing power is not a consideration: they are all treated equally. It's a bit like how WCG organizes their internal stats and badges by the hours of computing completed, rather than cobblestones. That way, people with fast computers can't skew the popularity calculation. However, someone with a lot of hosts could throw the rankings a bit; but someone who's willing to put down the cash to pay for energy costs, etc. probably has something to say about the project regardless of their actual computing contribution. Also remember that this ranking isn't representative of BOINC as a whole, but only of users who are using BAM!.

I think throwing computing stats into the popularity makes for a rather unfair comparison, since those with deep pockets can throw in very powerful computers, making their ratings of projects carry more weight than those with slower computers. And just because a few powerful crunchers have put in a lot of cobblestones on a project does not mean that a majority of users like it over another project.
Nuadormrac
BAM!ID: 75286
Joined: 2009-09-15
Posts: 56
Credits: 15,389,584,632
World-rank: 225

2012-08-14 17:35:59
last modified: 2012-08-14 17:48:47

My point is that it is a consideration if you want to find out how popular a project is with a given user. OK, I'll give you an example of how this can affect things. I'm a single BOINC user, hence one individual. If I were counted, this is how looking at hosts only could skew the results when trying to figure out how popular something is among the crunchers themselves (i.e. the people, not the machines).

- My i7 quad completes Primaboinca tasks in about 2 hours each and can crunch 8 tasks at a time (the quad core supports hyper-threading). That's 12 WUs per core per day (24 hours divided by 2 hours each) x 8 cores = 96 WUs per day at 100% resource share.

- My Core 2 Duo takes about 3.5 hours to complete the same WU, or 6.857 WUs per core per day x 2 cores = 13.71 WUs per day. 13.71 WUs completed isn't anywhere close to 96 for a 100% resource share.

Now, looking at hosts only, if I ran Primaboinca at 100% on the Core 2 Duo but only at 50% resource share on the i7, you'd have the project looking more popular on the Core 2, even though it's doing < 14 WUs a day. The i7 would make it look less popular, yet it would complete 48 WUs per day. That's 3.5x more work done, and a 3.5x difference in my (the user's, not the computer's) contribution.
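The arithmetic behind those numbers (taken from the post above, not measured benchmarks) checks out:

```python
# Reproducing the throughput figures quoted above.
HOURS_PER_DAY = 24

i7_wus_per_day = (HOURS_PER_DAY / 2.0) * 8    # 2 h/WU, 8 logical cores -> 96
c2d_wus_per_day = (HOURS_PER_DAY / 3.5) * 2   # 3.5 h/WU, 2 cores -> ~13.71

# 50% share on the i7 vs 100% on the Core 2 Duo:
i7_at_half = 0.5 * i7_wus_per_day             # 48 WUs/day
ratio = i7_at_half / c2d_wus_per_day          # ~3.5x more work from the i7

print(i7_wus_per_day, round(c2d_wus_per_day, 2), round(ratio, 1))
```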

What I'm suggesting is that popularity among machines is kind of a meaningless figure: there really is no such thing as popularity for a computer. Machines are things, with no free will and no decision-making capability; they run what they're programmed and set to run. What really matters, when we want to know how popular a project is, is how popular it is with people. And for the reason suggested above, differences in computing power can skew the results: you can't tell how popular a project is with me by simply taking a percentage resource count on the machines themselves (as opposed to on my account) while treating all machines as equal.

If I wanted to know how popular a project is, I would want to know how popular it is among the user base, i.e. what other people are deciding, which would hold meaning. It is those who own the machines who make the decisions and set their computers to crunch this or that, not the computers themselves. When all is said and done, a computer, whatever its capabilities, is nothing more than a tool for whatever task is put to it; as an object it has no preferences, and preference is where the concept of popularity comes in. Is popularity among objects that can't make decisions really what one wants to see? Or, when someone asks how popular this or that choice is, are they asking about the people, who can show a preference and set things up to follow it in the first place?

Even with BAM!, I seem to remember that one can set certain machines not to run a given project, further complicating the matter...

Now here is what I would propose, though I'm not sure whether BOINCstats gets the necessary data for the intermediary calculation, or what kind of load it would put on the servers. Instead of figuring out project popularity as

project_popularity == resource_share_per_host * total_hosts

one looks at

project_popularity == resource_share_per_user * total_users

where resource_share_per_user would be a calculated value taking into account a user's resource share allocations across all of their machines. These per-person decisions about cross-host resource allocation would then feed into the total. I for one would find this far more meaningful, as it would tell me who is running what. If I want to know whether something is popular or not, what do I really care about: the people or the CPUs? It is the people who set things up and make the decisions; it's the people who matter. Treating all machines as equal, while ignoring how people can allocate resources across different computers with different total resources available, can really muddy the result. Heck, if I wanted to run the same project on two different computers but give it a different resource share from my main host, I could shunt it over to a different venue (like school instead of home) and set a different per-venue resource share, taking the machines' capabilities into account when making that decision.

After all, when setting up the machines, I really care about what I, as a person, will be contributing to, and I might fiddle with the nuances to crunch more of what I want to run more of. The computer has no part to play, except to do what I tell it to do. This would then represent the total resource share as set by me, in total, rather than per machine. Accumulate this across users in general, and one would get an idea of the relative popularity of the projects among all users themselves... By leaving this out, one might end up with results that look a bit different. I.e., if I ran Primaboinca at 100% on my Core 2 and 50% on my i7, my actual contribution is not 150% across 2 machines, nor 75% of my total resource share to that project; it's actually more than 50% but less than 75% of my total resources I'd be giving to them.
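That "more than 50% but less than 75%" claim can be checked by weighting each host's share by its throughput (using the WUs/day figures from the earlier posts as a stand-in for host capacity):

```python
# Capacity-weighted per-user share for the 100%-on-C2D, 50%-on-i7 case.
# Capacities are the WUs/day figures quoted earlier in the thread.
c2d_capacity = 24 / 3.5 * 2      # ~13.71 WUs/day on the Core 2 Duo
i7_capacity = 24 / 2 * 8         # 96 WUs/day on the i7

weighted = 1.0 * c2d_capacity + 0.5 * i7_capacity   # resources given to the project
fraction = weighted / (c2d_capacity + i7_capacity)  # share of the user's total

print(round(fraction, 4))        # ~0.5625: above 50%, well below 75%
```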

noderaser wrote:

At current, the host's computing power is not a consideration--they are all treated equal. It's a bit like how WCG organizes their internal stats and badges by the hours of computing completed, rather than cobblestones. That way, people with fast computers can't skew the popularity calculation. However, someone with a lot of hosts could throw the rankings a bit--but someone who's willing to put down the cash to pay for energy costs, etc. probably has something to say about the project regardless of their actual computing contribution. Also remember that this ranking isn't representative of BOINC as a whole, but only users who are using BAM.

I think throwing in computing stats with the popularity is a rather unfair comparison, since those with deep pockets can throw in very powerful computers, making their ratings of projects have more weight than those with slower computers. And, just because a few powerful crunchers have put in a lot of cobblestones on a project does not mean that a majority of users like it over another project.


Index :: BOINCstats general :: Project Popularity