Now that it is finally possible to create or modify a
microcode fully compatible with the Nintendo SDK, I started considering what
would be the best approach to tame the N64 RSP further.
For the time being, it is quite obvious that writing an entirely
new graphics microcode from scratch is still beyond my skills, so I decided
that optimizing and enhancing the original Fast3D microcode would be a good
exercise.
So for the last two months, I changed bits of the Fast3D
source code here and there and learnt various things about how the code is
organized and how to program the RSP.
It quickly became apparent that the microcode was written with a
very theoretical approach, that is, without much thought given to making the
most of the limited resources of the RSP.
A few ideas also came to me about what could be implemented
differently, and from there I finally set up a high-level plan.
The philosophy behind it would be as follows:
1. Save as much space as possible in IMEM/DMEM with little impact on
performance.
2. Implement missing features to match some slightly
more "modern" versions of OpenGL, from which Fast3D seems to be
somewhat inspired.
After some investigation, I came up with the following:
1. Since the Texture Rectangle RDP command is composed of 4 words,
it would make sense to include it in the display list as a single command
rather than splitting it into 3 commands as in Fast3D.
2. In order to save DMEM space, either get rid of some
constants (e.g. the OpenGL offset, Newton's iteration constants, the DRAM
address mask, "reserved" DMEM space; include DMEM addresses directly in GBI
commands rather than loading them from a DMEM address, etc.) or reduce their
size (e.g. segments, light data structure, etc.).
3. Implement an optional double DMA buffer for some
immediate commands. Data would be fetched from RDRAM into one DMEM buffer
while the RSP works on the other DMEM buffer.
4. Implement a circular buffer to store transformed vertices
and generate triangle strips, with a primitive restart function. It would
also be possible to use triangle lists, triangle fans or, where possible,
quads. As the vertex buffer would be dramatically reduced in size, the number
of vertices held in the buffer should normally be high. It would still be
good to keep the indexed triangle implementation where possible.
5. Save IMEM, as some commands could be implemented
differently (set/clear geometry mode, moveword, movemem, etc.).
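
To make idea 4 a little more concrete, here is a minimal sketch in plain C (not
RSP assembly) of how a small circular buffer could turn a stream of vertices
into strip triangles with a primitive restart. Everything named here is an
assumption made for illustration: RING_SIZE, the STRIP_RESTART marker value,
the Tri struct and strip_to_tris() do not come from Fast3D or the SDK, and the
ring stores plain vertex ids where a real microcode would store transformed
vertices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 4            /* hypothetical number of vertex slots */
#define STRIP_RESTART 0xFFFFu  /* hypothetical primitive-restart marker */

typedef struct { uint16_t a, b, c; } Tri;

/* Read the slot written `back` steps ago (1 = most recent vertex). */
static uint16_t ring_get(const uint16_t *ring, size_t head, size_t back)
{
    return ring[(head + RING_SIZE - back) % RING_SIZE];
}

/* Walk a strip of vertex ids and write the resulting triangles to `out`.
 * Returns the number of triangles written (at most `max_out`). */
size_t strip_to_tris(const uint16_t *ids, size_t n, Tri *out, size_t max_out)
{
    uint16_t ring[RING_SIZE] = {0};
    size_t head = 0;   /* next slot to overwrite */
    size_t run = 0;    /* vertices received since the last restart */
    size_t count = 0;

    assert(ids != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (ids[i] == STRIP_RESTART) {
            run = 0;   /* start a new strip; old slots are simply reused */
            continue;
        }
        ring[head] = ids[i];
        head = (head + 1) % RING_SIZE;
        run++;
        if (run < 3 || count >= max_out)
            continue;

        uint16_t v0 = ring_get(ring, head, 3);
        uint16_t v1 = ring_get(ring, head, 2);
        uint16_t v2 = ring_get(ring, head, 1);

        /* Every second triangle swaps its first two vertices so that
         * the winding order stays consistent along the strip. */
        int flip = (run % 2 == 0);
        out[count].a = flip ? v1 : v0;
        out[count].b = flip ? v0 : v1;
        out[count].c = v2;
        count++;
    }
    return count;
}
```

With this layout only the last three vertices are ever read back, so the
number of slots could stay very small for pure strips; fans or indexed
triangles would need a larger window, which is where the trade-off with DMEM
space comes in.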
For a first customization, this seems ambitious enough, but
having run various tests, I am hopeful that it can be achieved successfully.
The next step will be much more concrete, going directly into the
intricate details of the microcode.
Finally, feel free to share any other ideas you may have in the comments,
but please understand that no extreme modifications are planned (e.g.
Z sorting, involving the CPU in the tasks, etc.). Thanks!