SDL image display slower than wanted

Time:08-05

I want to use SDL2 to display a list of RGB values I have in memory on the screen. My issue is that updating the screen takes a long time to perform, even for a modestly sized image. On my laptop the example code below takes 11.8 milliseconds to update the screen, which would mean that if you wanted 60 fps output it would only leave 4.2 milliseconds to do everything else. I don't know what other method to use to display an image.


//gcc -Ofast -g "%f" `pkg-config --cflags --libs sdl2` -o SDL_test_3
#include <SDL2/SDL.h>
#include <stdio.h>
#include <stdlib.h> //for malloc
#include <time.h>

//====---The_General_Use_Variables---====
unsigned char uc1;

long l1;

float f1;
float f2;
float f3;
float t1;
float t2;




int main(){
    int sx=1000;
    int sy=700;
    int did_work;
    
    //initializes SDL
    did_work=SDL_Init(SDL_INIT_VIDEO);
    if (did_work!=0){printf("Issue_in_SDL_Init\n");}

    //Creates a window
    SDL_Window *window=NULL;
    window = SDL_CreateWindow( "win1", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, sx, sy, SDL_WINDOW_SHOWN );

    //for a window to be on screen you need to be constantly monitoring for window events
    SDL_Event windowEvent;
    if (!window){printf("SDL_CreateWindow_failed\n");}

    //assigns the window a surface
    SDL_Surface *screenSurface;
    screenSurface=SDL_GetWindowSurface(window);

    //creates an im_buffer
    unsigned char *im_buffer = malloc(sx*sy*3*sizeof(unsigned char));
    for(l1=0;l1<sx*sy*3;l1++){im_buffer[l1]=0;}

    //creates a screensurface with the im_buffer painted on it
    SDL_Surface *im_surface = SDL_CreateRGBSurfaceFrom((void*)im_buffer,
            sx,
            sy,
            24,          // bits per pixel = 24
            sx*3,  // pitch
            0x0000FF,              // red mask
            0x00FF00,              // green mask
            0xFF0000,              // blue mask
            0);

    
    //this command copies the im_buffer's surface to the displayed window surface
    SDL_BlitSurface(im_surface, NULL, screenSurface, NULL);
    SDL_UpdateWindowSurface(window);
    
    //setting up for loop
    int seconds=5;
    t1=clock();
    t1+=CLOCKS_PER_SEC*seconds;
    t2=clock();
    long counter=0;
    
    uc1=0;
    while(t2<t1){
        uc1++;
        counter++;
        SDL_PollEvent(&windowEvent);
        
        for(l1=0;l1<sx*sy*3;l1+=3){im_buffer[l1]=uc1;}
        SDL_FreeSurface(im_surface); //free the previous surface so one is not leaked each frame
        im_surface=SDL_CreateRGBSurfaceFrom((void*)im_buffer,
            sx,
            sy,
            24,          // bits per pixel = 24
            sx*3,  // pitch
            0x0000FF,              // red mask
            0x00FF00,              // green mask
            0xFF0000,              // blue mask
            0);
        
        SDL_BlitSurface(im_surface, NULL, screenSurface, NULL);
        SDL_UpdateWindowSurface(window);
        
        t2=clock();
        }
    f1=counter;
    f2=seconds;
    printf("average_fps=%f \n",f1/f2);
    printf("frametime=%f \n",1000*f2/f1);
}

CodePudding user response:

Most games put their textures into video memory so that the GPU can handle all of the rendering. This will normally give you the most performance, because that is what GPUs are optimized for.

However, in your code, you are updating the surface/texture once per frame. This is bad for performance. Instead of doing this, it is generally better not to modify the textures, or if you do need such visual effects, this should generally be done inside video memory with the GPU, for example by using shaders. The CPU is generally much slower.

In SDL 2.0, an SDL_Surface is an image in system memory, which is handled by the CPU. If you want to move the image to video memory, where it is generally faster and handled by the GPU, you must convert the SDL_Surface to an SDL_Texture, for example using the function SDL_CreateTextureFromSurface.

However, when I converted your code to use an SDL_Texture and SDL_CreateTextureFromSurface and tested it on Microsoft Windows, I did not get a performance increase. Instead, the frame rate dropped by about 30% compared to your original code. This is because I was calling SDL_CreateTextureFromSurface once per frame, because the texture was changing in every frame.

After that, I tried creating a streaming texture, which is an SDL_Texture created with SDL_TEXTUREACCESS_STREAMING. This allows you to use the function SDL_LockTexture, which gives you direct write access to the texture's pixels in an efficient manner. By doing this, I was able to improve the frame rate by 46% compared to your original code. Here is my modified code:

#include <SDL.h>
#include <SDL_render.h>
#include <stdio.h>
#include <stdlib.h>

//====---The_General_Use_Variables---====
unsigned char uc1;

long l1;

double f1;
double f2;
Uint64 t1;
Uint64 t2;


int main( int argc, char *argv[] )
{
    //timers for debugging
    Uint64 TimeStart, TimeLockTexture = 0, TimeRenderPresent = 0;

    //define dimensions of window
    int sx=1000;
    int sy=700;

    //initialize SDL
    if ( SDL_Init(SDL_INIT_VIDEO) != 0 )
    {
        printf("SDL_Init failed!\n");
        exit( EXIT_FAILURE );
    }

    //create a window and renderer
    SDL_Window *window;
    SDL_Renderer *renderer;
    if ( SDL_CreateWindowAndRenderer( sx, sy, SDL_WINDOW_SHOWN, &window, &renderer ) != 0 )
    {
        printf("SDL_CreateWindowAndRenderer failed!\n");
        exit( EXIT_FAILURE );
    }

    //create streaming texture
    SDL_Texture *texture = SDL_CreateTexture( renderer, SDL_PIXELFORMAT_RGB24, SDL_TEXTUREACCESS_STREAMING, sx, sy );
    if ( texture == NULL )
    {
        printf( "CreateTexture failed!\n" );
        exit( EXIT_FAILURE );
    }

    //setting up for loop
    Uint64 seconds = 5;
    t1=SDL_GetPerformanceCounter();
    t1+=SDL_GetPerformanceFrequency()*seconds;
    t2=SDL_GetPerformanceCounter();
    long counter=0;

    uc1=0;
    while(t2<t1)
    {
        SDL_Event windowEvent;

        uc1++;
        counter++;
        SDL_PollEvent(&windowEvent);

        TimeStart = SDL_GetPerformanceCounter();
        SDL_RenderClear(renderer);
        unsigned char *pixels;
        int pitch;
        if ( SDL_LockTexture( texture, NULL, (void**)&pixels, &pitch ) != 0 )
        {
            printf( "LockTexture failed!\n" );
            exit( EXIT_FAILURE );
        }

        for ( int i = 0; i < sy; i++ )
        {
            unsigned char *p = (unsigned char*)pixels + i * pitch;
            unsigned char *limit = p + sx * 3;

            while( p < limit )
            {
                *p++ = uc1; //red value
                *p++ = 0;   //green value
                *p++ = 0;   //blue value
            }
        }
        SDL_UnlockTexture( texture );
        TimeLockTexture += SDL_GetPerformanceCounter() - TimeStart;

        TimeStart = SDL_GetPerformanceCounter();
        SDL_RenderCopy( renderer, texture, NULL, NULL );
        SDL_RenderPresent(renderer);
        TimeRenderPresent += SDL_GetPerformanceCounter() - TimeStart;

        t2=SDL_GetPerformanceCounter();
    }

    SDL_DestroyTexture(texture);
    SDL_DestroyRenderer(renderer);
    SDL_DestroyWindow(window);

    f1=(double)counter;
    f2=(double)seconds;
    printf("average_fps=%f \n",f1/f2);
    printf("frametime=%f \n",1000*f2/f1);
    printf("seconds spent writing to locked texture:  %f\n", (double)TimeLockTexture / SDL_GetPerformanceFrequency() );
    printf("seconds spent doing the actual rendering: %f\n", (double)TimeRenderPresent / SDL_GetPerformanceFrequency() );

    SDL_Quit();

    return 0;
}

On my 7-year-old desktop GPU, this program produces the following output:

average_fps=474.400000
frametime=2.107926
seconds spent writing to locked texture:  3.689574
seconds spent doing the actual rendering: 1.298355

However, even using the method described above, changing the texture in every frame is still quite costly. If I remove the code which changes the texture in every frame, the frame rate increases by a factor of about 12.

I get a similar performance increase if I reduce the size of the texture to 32*32 pixels and use that texture for filling the entire screen, by stretching the texture. This clearly shows that the bottleneck is updating the (large) texture.

In your question, you wrote:

Which would mean if you wanted a 60fps output it would only leave 4.2 milliseconds to do everything else.

This statement is inaccurate, unless your laptop only has a single CPU core. If you have more than one CPU core, then you can make one core handle the creation of the individual pixels and sending them to the graphics card using SDL, while another core handles everything else in your program that is not directly related to SDL. This can be done by using multithreading.

It would probably also be possible to use two threads (i.e. two CPU cores) for modifying the texture. However, I doubt that this would be worth the cost of synchronizing the threads. Since you are already achieving the desired frame rate of 60 FPS when using a single thread for modifying every single pixel on the screen and passing them to SDL, it would probably be best to make this thread do this and nothing else, and let another thread do all the other work.

Note that certain SDL functions should only be called by one thread. For example, according to this page from the official libsdl documentation, you should not be calling functions from SDL_render.h from different threads.
