Archive for the gpu Category

One more video (432*432 robots, zombies, eat-them-freaks, whatever)

Posted in GPU Random Number, c++, gpu, programming, project zombie on April 9, 2009 by bey0ndy0nder

Next version will look very different and hopefully with better crowd behavior! I want flying things, destructible environments, fire effects and atmospheric effects, buildings…and definitely some sort of behavior.

BTW, the sound clip IS FREE; it was taken off some site advertising Resident Evil: Apocalypse, which turned up in a Google search for FREE Resident Evil music. Also, I’m not making any political or general statement with my choice of sound clip.

PLUS, WTF IS WRONG with the quality man??? It does, however, give the scene a darker feel.

OH noes! Frapping Zombies!!!

Posted in GPU Noise, GPU Random Number, c++, gpu, programming, project zombie on April 8, 2009 by bey0ndy0nder

Finally fixed the issues.

I’m going to spend the next few days going back and documenting the hell out of everything. Maybe do some resource refactoring… then again, maybe not, I want to start working on spherical harmonics!

BTW, what do I have to do to get better video quality? I encoded it at 3000kb/s… hmmm, maybe I have to increase the duration in order for YouTube HD to kick in?

It don’t matter. The video sucks anyway.

I’m back? And WINDOZE still sucks!

Posted in c++, gpu, linux, programming, thoughts on March 21, 2009 by bey0ndy0nder

Hello,

Am I back? Maybe. I won’t be updating regularly, that’s for sure. What I (we) need is more action and results and less bullshyte. What this means: I won’t update unless I have something really kick-ass; so, after getting this kick-ass thing out, I will then update and write about development, like I used to. Until then, adios.

There is the thing about my new computer: I’ve bought an AMD Phenom II X4 system, with ATI (the X2 one) hardware (GASPPPPPP!!!) ATI???

Yes. Even though I think ATI has shitty Linux drivers, my friends and I decided that we are doing this “coalition of the willing” thing (as far as computer ‘nerd’ shyt is concerned), where we support the underdog (to further competition for nerd shyte) and go gangsta on Intel and Nvidia while showing love to AMD and ATI! (No, my friend didn’t put it like that. I’ve taken the liberty of sprinkling in the “coalition of the willing” thing. Cos I’m gangsta like that.)

HAHA. Initially I was not too sure about this choice, since I wanted the best hardware possible (one has to admit that the current Intel Nvidia X2 line kicks ass, performance-wise). But, after evaluating the price/performance ratio, I think the AMD Phenom is a great choice. I’m still not completely 100% on the ATI decision, since I’ve been an Nvidia dude forever. But since this is gangsta… we gotta do gangsta stuff… so ATI it is. (I.e. we’re repping the AMD/ATI clique now.)

And lastly… Windows (Vista) STILL SUCKS BALLS!!!!!!!!!!!!!!!! Haha, but I’m downloading Visual C++ Express Edition as we speak; parse whatever meaning and implications you will from that :)

(So much hate… oh the humanity)

Project Zombie imposter shader effects

Posted in glsl, gpu, mathematics, programming, project zombie, source code, thoughts on October 15, 2008 by bey0ndy0nder

So I finally fixed some bugs with my imposter renderer. Now it works great. I also put in some simple Phong shading.

I have several options for doing lighting:

1. Store normal data along with the imposter texture. This is workable, but not memory efficient. This option would look great though. (This may actually work in the end. I still have not thought much about compression schemes for imposter textures.)

2. Chi Ting Lighting ™. Yes, this is what I have implemented, and it is probably what I will go with in the end if option 3 does not work out. Basically, I’m using a sphere to approximate the shape of my imposter object. I sample this sphere based on the texture coordinate (i.e. theta and phi values), so that every pixel is shaded based on this sphere. Sort of like hemispheric lighting… An improvement to the current method is to compute some factors that give a better sample, e.g. finding bounds for each imposter view angle and mapping those bounds to a best-fit spherical theta and phi range. This brings me to option 3…

3. Use fancy mathematics ™. That is, find some sort of mathematical function that approximates the shape of our object, and then sample from this function. Of course, any storage used for this function should be less than what the normal-map option requires. Think spherical harmonics type stuff… I am not saying spherical harmonics will be the solution here, but I take my ideas from that: some special function to represent the data.
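To make option 2 a bit more concrete, here is a minimal CPU-side sketch in C++ of shading a pixel from a sphere proxy. It is not the shader from the project: the names `Vec3`, `sphereNormalFromUV` and `lambert` are made up for illustration, and the mapping of the texture coordinate onto phi and theta is an assumption.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// Assumed mapping (hypothetical): u in [0,1] -> phi in [0, 2*pi],
// v in [0,1] -> theta in [0, pi], with theta measured from the +y axis.
Vec3 sphereNormalFromUV(float u, float v) {
    const float pi = 3.14159265358979f;
    const float phi = u * 2.0f * pi;
    const float theta = v * pi;
    // Unit normal of the proxy sphere at (theta, phi).
    return { std::sin(theta) * std::cos(phi),
             std::cos(theta),
             std::sin(theta) * std::sin(phi) };
}

// Simple diffuse term: clamp(dot(n, l), 0) for a unit light direction l.
float lambert(const Vec3& n, const Vec3& l) {
    return std::max(0.0f, n.x * l.x + n.y * l.y + n.z * l.z);
}
```

The per-view bounds idea from option 2 would amount to remapping `u` and `v` into a narrower phi/theta range before calling `sphereNormalFromUV`.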

MD5GPU reloaded (and debugged):

Posted in GPU Noise, GPU Random Number, glsl, gpu, mathematics, programming, project zombie, source code, thoughts on October 5, 2008 by bey0ndy0nder

It’s working now. I haven’t tested it with DIEHARDER yet; I may do it later, when I have time. But if it looks like white noise, walks like white noise…

BTW, the author’s (of the paper) optimization works fine. Really, think about it, why wouldn’t it work? It’s still rotating, and that’s all that matters really.

I’m going to start working on the agent simulation part of PZ.

#extension GL_EXT_gpu_shader4 : enable
//This function initializes the 512-bit block following the MD5 padding scheme.
//The first 128 bits are the input, xored with the key, which can act like a seed value.
//The remaining twelve 32-bit words are filled in
//to pad our data to 512 bits:
//word 0-3: input xor key
//word 4: 0x80000000. This corresponds to appending a single 1 bit after words 0-3.
//word 5-13: 0. This corresponds to appending zeros up to bit 448.
//word 14-15: 0x0000000000000080. This is the bit length of the input (128 bits) as a 64-bit value.
//Note: reference MD5 (RFC 1321) reads words little-endian, which would give word 4 = 0x00000080,
//word 14 = 0x00000080 and word 15 = 0. The layout below is fine for noise, but it is not
//bit-compatible with standard MD5 digests.
//(The parameter is named src because input is a reserved word in GLSL.)
void setupInput(in uvec4 src, in unsigned int key, inout unsigned int data[16])
{
	data[0] = src.x^key; data[1] = src.y^key; data[2] = src.z^key; data[3] = src.w^key; //xor base with key
	data[4] = 0x80000000u;
	data[5] = 0u; data[6] = 0u; data[7] = 0u; data[8] = 0u;
	data[9]=0u; data[10]=0u; data[11]=0u; data[12]=0u; data[13]=0u;
	data[14] = 0x00000000u; data[15]=0x00000080u;
}
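//initialize the digest to the four starting words.
//Note: RFC 1321 lists these bytes low-order first, so reference MD5 uses
//0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476. The values below still work as a noise seed.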
uvec4 initDigest()
{
	return uvec4(0x01234567u,0x89ABCDEFu,0xFEDCBA98u,0x76543210u);
}
//F compression functions
//(b & c) | ((not b) & d)
unsigned int F0_15(in uvec3 tD)
{
	return (tD.x & tD.y) | ((~tD.x) & tD.z);
}
//(d & b) | ((not d) & c)
unsigned int F16_31(in uvec3 tD)
{
	return (tD.z & tD.x) | ((~tD.z) & tD.y);
}
//b ^ c ^ d
unsigned int F32_47(in uvec3 tD)
{
	return tD.x ^ tD.y ^ tD.z;
}
//c ^ (b | (~d))
unsigned int F48_63(in uvec3 tD)
{
	return tD.y ^ (tD.x | (~tD.z));
}

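//return value/(2^32), mapping each 32-bit word into [0,1]
//(input and output are reserved words in GLSL, so neither is used as a name here)
vec4 convertToR0_R1(in uvec4 value)
{
	return vec4(value) / 4294967296.0;
}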

uvec4 whiteNoise(in uvec4 src,in unsigned int key)
{
	unsigned int data[16];
	setupInput(src,key,data);
	//per-round rotation amounts; each vector is cycled with .yzwx after every step
	uvec4 rot0_15 = uvec4(7u,12u,17u,22u);
	uvec4 rot16_31 = uvec4(5u,9u,14u,20u);
	uvec4 rot32_47 = uvec4(4u,11u,16u,23u);
	uvec4 rot48_63 = uvec4(6u,10u,15u,21u);

	uvec4 digest = initDigest();
	uvec4 tD;
	unsigned int fTmp; //scalar result of the round function
	unsigned int i = 0u;
	unsigned int idx;
	unsigned int r;
	unsigned int trig;
	unsigned int temp;
	//What follows are the four 16-step rounds covering steps 0 through 63.
	//trig is floor(abs(sin(i+1)) * 2^32); a 32-bit float sin() only approximates the MD5 table.
	tD = digest;
	for(;i<16u;i++)
	{
		fTmp = F0_15(tD.yzw);
		idx = i;
		r = rot0_15.x;
		rot0_15 = rot0_15.yzwx;
		trig = unsigned int(floor(abs(sin(float(i+1u)))*4294967296.0));
		temp = tD.x+fTmp+data[int(idx)]+trig;
		tD.x = tD.y + ((temp << r) | (temp >> (32u-r))); //rotate left by r
		tD = tD.yzwx;
		digest +=tD;
	}
	for(;i<32u;i++)
	{
		fTmp = F16_31(tD.yzw);
		idx = (5u*i + 1u) % 16u;
		r = rot16_31.x;
		rot16_31 = rot16_31.yzwx;
		trig = unsigned int(floor(abs(sin(float(i+1u)))*4294967296.0));
		temp = tD.x+fTmp+data[int(idx)]+trig;
		tD.x = tD.y + ((temp << r) | (temp >> (32u-r))); //rotate left by r
		tD = tD.yzwx;
		digest +=tD;
	}
	for(;i<48u;i++)
	{
		fTmp = F32_47(tD.yzw);
		idx = (3u*i + 5u) % 16u;
		r = rot32_47.x;
		rot32_47 = rot32_47.yzwx;
		trig = unsigned int(floor(abs(sin(float(i+1u)))*4294967296.0));
		temp = tD.x+fTmp+data[int(idx)]+trig;
		tD.x = tD.y + ((temp << r) | (temp >> (32u-r))); //rotate left by r
		tD = tD.yzwx;
		digest +=tD;
	}
	for(;i<64u;i++)
	{
		fTmp = F48_63(tD.yzw);
		idx = (7u*i) % 16u;
		r = rot48_63.x;
		rot48_63 = rot48_63.yzwx;
		trig = unsigned int(floor(abs(sin(float(i+1u)))*4294967296.0));
		temp = tD.x+fTmp+data[int(idx)]+trig;
		tD.x = tD.y + ((temp << r) | (temp >> (32u-r))); //rotate left by r
		tD = tD.yzwx;
		digest +=tD;
	}

	return digest;
}
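
For sanity-checking the shader against the CPU, here is a small C++ sketch of two building blocks of the reference algorithm: the sine-derived step constant and the 32-bit left rotation. The function names `md5SineConstant` and `rotl32` are made up for this example. The constants are computed in double precision, which is an assumption that differs from the float math a shader would use.

```cpp
#include <cmath>
#include <cstdint>

// Step constant for step i (0..63): floor(abs(sin(i + 1)) * 2^32),
// with i + 1 taken in radians, as in RFC 1321.
std::uint32_t md5SineConstant(std::uint32_t i) {
    const double scaled = std::fabs(std::sin(static_cast<double>(i + 1))) * 4294967296.0;
    return static_cast<std::uint32_t>(std::floor(scaled));
}

// Rotate a 32-bit word left by s bits, for s in 1..31.
std::uint32_t rotl32(std::uint32_t x, unsigned s) {
    return (x << s) | (x >> (32u - s));
}
```

Comparing a few values of `md5SineConstant` with what the shader computes in float is a quick way to see how many low bits the GPU version loses.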

MD5GPU algorithm implemented (source code)

Posted in glsl, gpu, programming, project zombie, source code, thoughts on October 3, 2008 by bey0ndy0nder

So I started implementing the MD5GPU algorithm. It’s pretty straightforward. Matter of fact, I will be brief. (Note, I have not tested the code below)

From the original MD5 algo., we need to break the input into 512-bit chunks. So we first pad the input to a length a, and then append a 64-bit length field, so that the total length is a multiple of 512 bits. The padded length a must satisfy:

a congruent to 448 mod 512. So we keep on appending 0 bits to our message (after first appending a single 1 bit) until (a - 448) mod 512 = 0. Since (note ~ is the equivalence relation)

a ~ b mod n => (a-b) = cn => (a-b)/n = c, where c is in Z (the integers), iff (a-b) mod n = 0. (Maybe in a future post I will talk about how this relates to group theory.)

But for our problem, since the message length is a constant 128 bits, we know from the get-go that a is 448: 128 message bits, plus the single 1 bit, plus 319 zero bits. Adding the 64-bit length field gives exactly one 512-bit chunk.
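
The padding rule can be written as a tiny C++ helper. This is only a sketch of the arithmetic described above. The name `paddedBitsBeforeLength` is made up for this example.

```cpp
#include <cstdint>

// Returns a, the smallest length in bits such that a >= messageBits + 1
// and a is congruent to 448 mod 512. The 64-bit length field is appended
// after this, so a + 64 is always a multiple of 512.
std::uint64_t paddedBitsBeforeLength(std::uint64_t messageBits) {
    const std::uint64_t withOne = messageBits + 1;  // the mandatory 1 bit
    const std::uint64_t rem = withOne % 512;
    const std::uint64_t zeros = (rem <= 448) ? (448 - rem) : (512 + 448 - rem);
    return withOne + zeros;
}
```

For the fixed 128-bit input used here, this gives 448, so the whole message fits in a single 512-bit chunk. A 448-bit message would spill into a second chunk, because the 1 bit no longer fits before bit 448.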

Okay, that’s not all that interesting, since I’m just describing the algorithm. I may post some more on this tomorrow. I’m heading out for the night.

The interesting part is WHY does this thing work? Why does it produce results that pass all the DIEHARD tests? My intuitive understanding (but I’m not sure, I never studied crypto) is that this is due to the combinatorial explosion that comes from working with 512-bit chunks. The compression functions are such that a change in ONE single input bit changes each output bit with a probability of 1/2. So, it’s like we are ‘scrambling’ the input in this 512-bit combinatorial ‘space’, which is a huge space… (I’m really sorry if the above bit totally pisses you off with all the hand-waving from my ignorant understanding.)
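
One way to poke at this intuition empirically (it proves nothing) is to flip a single input bit and count how many of the 128 output bits change. The C++ sketch below does that with a plain single-block reference MD5 following RFC 1321, not the GPU variant. The names `md5SingleBlock` and `differingBits` are made up for this example, and the function assumes the message is shorter than 56 bytes so that it fits in one chunk.

```cpp
#include <array>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <string>

// Reference MD5 for messages shorter than 56 bytes (one 512-bit chunk).
std::array<std::uint32_t, 4> md5SingleBlock(const std::string& msg) {
    std::uint8_t block[64] = {0};
    std::memcpy(block, msg.data(), msg.size());
    block[msg.size()] = 0x80;  // the appended 1 bit
    const std::uint64_t bitLen = static_cast<std::uint64_t>(msg.size()) * 8u;
    for (int b = 0; b < 8; ++b)  // 64-bit length, little-endian
        block[56 + b] = static_cast<std::uint8_t>(bitLen >> (8 * b));

    std::uint32_t m[16];
    for (int w = 0; w < 16; ++w)  // decode the words little-endian
        m[w] = static_cast<std::uint32_t>(block[4 * w])
             | (static_cast<std::uint32_t>(block[4 * w + 1]) << 8)
             | (static_cast<std::uint32_t>(block[4 * w + 2]) << 16)
             | (static_cast<std::uint32_t>(block[4 * w + 3]) << 24);

    static const unsigned shifts[4][4] = {
        {7, 12, 17, 22}, {5, 9, 14, 20}, {4, 11, 16, 23}, {6, 10, 15, 21}};
    const std::uint32_t a0 = 0x67452301u, b0 = 0xefcdab89u,
                        c0 = 0x98badcfeu, d0 = 0x10325476u;
    std::uint32_t a = a0, b = b0, c = c0, d = d0;
    for (std::uint32_t i = 0; i < 64; ++i) {
        std::uint32_t f, g;
        switch (i / 16) {
            case 0:  f = (b & c) | (~b & d); g = i;                break;
            case 1:  f = (d & b) | (~d & c); g = (5 * i + 1) % 16; break;
            case 2:  f = b ^ c ^ d;          g = (3 * i + 5) % 16; break;
            default: f = c ^ (b | ~d);       g = (7 * i) % 16;     break;
        }
        const std::uint32_t k = static_cast<std::uint32_t>(
            std::floor(std::fabs(std::sin(static_cast<double>(i + 1))) * 4294967296.0));
        const std::uint32_t x = a + f + k + m[g];
        const unsigned s = shifts[i / 16][i % 4];
        a = d; d = c; c = b;
        b = b + ((x << s) | (x >> (32u - s)));  // rotate left by s
    }
    return {a0 + a, b0 + b, c0 + c, d0 + d};
}

// Number of bit positions in which two 128-bit digests differ.
int differingBits(const std::array<std::uint32_t, 4>& x,
                  const std::array<std::uint32_t, 4>& y) {
    int count = 0;
    for (int w = 0; w < 4; ++w)
        for (std::uint32_t diff = x[w] ^ y[w]; diff != 0; diff >>= 1)
            count += static_cast<int>(diff & 1u);
    return count;
}
```

Hashing two 16-byte messages that differ in one bit and passing both digests to `differingBits` should land somewhere around 64 of 128 bits if the “each output bit flips with probability 1/2” picture holds.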

Not sure tho. That’s just my intuition…

What do you think?