Sort Array with Assembly Language Comparison
You will be writing a mixed-language program, mostly implemented in the C language,
that sorts an array of struct books. It will implement the majority of the Selection
Sort algorithm, but will call out to a small subroutine to compare each pair of books,
which you will write in assembly language.
Your program will be executed with an input file which will be read in from standard
input (we will use the shell’s I/O redirection to to this, so you will not have to explicitly
open any files in your code). A large sample data file is available at:
/afs/umbc.edu/users/r/a/ramseyk/pub/cmsc313/fall17/proj4/books.dat
You need to copy this file into your own directory.
Your C program’s first task will be to process the input, filling in an array of struct
book structs in memory. The array will be dimensioned to hold a maximum of 100
books. Your program will read in a line at a time, each line comprising a record of a
single book. It will continue reading records until either
(a) it hits the end-of-file, or
(b) it fills the entire array, whichever comes first. Here are a few sample records from
the file:
Breaking Point, Pamela Clare, Romance, 2011
Vow, Kim Carpenter, Nonfiction, 2012
1491, Charles C. Mann, Nonfiction, 2006
Three Weeks with My Brother, Nicholas Sparks, Nonfiction, 2004
Each of the string fields (title, author, subject) in your struct must be null-terminated-
-that is the reason each is dimensioned as the length limit + 1. So, AUTHOR_LEN, for
example, is the actual maximum number of real (i.e., non-null) characters that can
be in the author’s name, including spaces.
You can use scanf() to read in the fields. Every record is “clean”, meaning all fields
are present and of the right format. The only place where a comma (‘,’) appears is as
a field separator (i.e., it does not appear embedded in a title, author name, etc.). For
most of the fields, you can, and should, read directly into the actual field in the struct
book. However, there is one complication: The title in the file record might be too
large to fit in the space allotted in the corresponding field in the struct. Here is an example of a too-long title:
Harry Potter and the Chamber of Secrets, J.K. Rowling, Fantasy, 2012
This title is 39+1 characters long including the null. We’ve only allocated 32+1
characters for The title in our struct. If you try to read this title directly into the struct’s
field, one of two things will happen:
• fscanf() you will read in more characters than will fit in the field, overrunningthe field, with bad consequences; OR:
• you use the field width specification in fscanf() to stop after a limited numberof characters, in which case the rest of the title will stay in the input buffer,waiting to be (erroneously) read as part of the next field
Neither of those two options are satisfactory. So, you will have to first temporarilyread the title from the file into a buffer large enough to hold all of it (the providedcode implies that the title will never be more than 80 characters long, and you candepend on that), and then only copy a limited number of chars into the actual structfield (make sure you set aside room for the null!).
Once you’ve read in all of the book records, you will sort the books, based primarilyon year of publication, oldest year first. For all books published in the same year, youwill subsort by title, in alphabetic order.
Seethe partial sample output below.
You will sort the entries in the array by implementing Selection Sort, as described inthe Background section above. Since it is a simple array, iterating over it should beeasy. To swap two members, you can take advantage of the fact that struct-to-structassignment copies the entire contents of the struct. Slightly inefficient, compared tousing pointers, but that’s why we chose Selection Sort, to reduce the number ofswaps.
Within the inner loop of your Selection Sort implementation, you need to compare tworecords. To do this, your sort implementation must call the assembly routine bookcmp,which you will implement in bootcmp.asm. Since you have not yet learned how topass parameters in using C’s call-by-value mechanism, you will store pointers to thetwo struct book records you want to compare in global pointers called book1 andbook2.
You will implement bookcmpin assembly code, in the file bookcmp.asm. It will look inthe global variables book1 and book2 for pointers to the two records to be compared.It will return one of the integer values -1, 0, or +1 in the register EAX, depending onwhether book1 is strictly less than, equal to, or greater than book2, respectively.
You will obviously have to implement your C code and assembly code in separate files.
You should call the C file sort_books.c, and your assembly filebookcmp.asm. Theentry point of your assembly subroutine must be called bookcmp. Your sorting functionin sort_books.c must call this bookcmpsubroutine. We have not yet learned how Cpasses its parameters, so for now, your C code should put pointers to the two structbooks to be compared in the global pointers “book1” and “book2”, so that they canbe accessed easily from the assembly code. Thus, one of your first instructions inbookcmpshould be (after any required saving of registers with PUSH):
movebx, [book1]
movecx, [book2]
Now, EBX and ECX will contain pointers to the two struct books to be compared,and you can use indexed addressing modes on these.
At the end of your program, you will print out all of the books in sorted order, in theexact same format as the original input.
When you build your program, you must compile or assemble each source fileseparately:
linux2% gcc -m32 -ansi -Wall -g -c sort_books.c linux2% nasm -f elf -g bookcmp.asm
linux2% gcc -m32 sort_books.obookcmp.o -o sort_books
(We will provide a Makefile template appropriate for this project in the same directoryas sort_books.c.) Then, you can run the executable sort_books.outfile produced. Asample run should look like:
linux2% ./sort_books.out< books.dat
Dead until Dark, Charlaine Harris, Fantasy, 2001
Three Weeks with My Brother, Nicholas Sparks, Nonfiction, 2004
1491, Charles C. Mann, Nonfiction, 2006
Eclipse, Stephenie Meyer, Romance, 2007
New Moon, Stephenie Meyer, Romance, 2007
<… more output here>
sort_books.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/*
* Following is straight from the project description
*/
#define TITLE_LEN 32
#define AUTHOR_LEN 20
#define SUBJECT_LEN 10
struct book {
char author[AUTHOR_LEN + 1]; /* first author */
char title[TITLE_LEN + 1];
char subject[SUBJECT_LEN + 1]; /* Nonfiction, Fantasy, Mystery, … */
unsignedint year; /* year of e-book release */
};
/*
* Declarations for functions that are defined in other files
*/
STUB: ADD EXTERNAL FUNCTION DECLARATIONS HERE
/*
* Declarations for global variables that need to be accessed
* from other files
*/
STUB: ADD DECLARATIONS FOR YOUR OWN GLOBAL VARIABLES HERE
#define MAX_BOOKS 100
int main(intargc, char **argv) {
struct book books[MAX_BOOKS];
intnumBooks;
/* STUB: ADD CODE HERE TO READ A RECORD AT A TIME FROM STDIN USING scanf()
UNTIL IT RETURNS EOF
*/
for (i = 0; i < MAX_BOOKS; i++) {
/* Sample line: “Breaking Point, Pamela Clare, Romance, 2011” */
/* We are giving you the scanf() format string; note that
* for the string fields, it uses the conversion spec “%##[^,]”,
* where “##” is an actual number. This says to read up to a
* maximum of ## characters (not counting the null terminator!),
* stopping at the first ‘,’ Also note that the first field spec–
* the title field–specifies 80 chars. The title field is NOT
* that large, so you need to read it into a temporary buffer first,
* of an appropriately large size so that scanf() doesn’t overrun it.
* All the other fields should be read directly into the structbook’s
* members.
*/
numFields = scanf(“%80[^,], %20[^,], %10[^,], %u \n”,
STUB: REST OF ARGS TO scanf()???);
if (numFields == EOF) {
numBooks = i;
break;
}
/* Now, process the record you just read.
* First, confirm that you got all the fields you needed (scanf()
* returns the actual number of fields matched).
* Then, post-process title (see project spec for reason)
*/
STUB: VERIFY AND PROCESS THE RECORD YOU JUST READ
}
/* Following assumes you stored actual number of books read into
* varnumBooks
*/
sort_books(books, numBooks);
print_books(books, numBooks);
exit(1);
}
/*
* sort_books(): receives an array of struct book’s, of length
* numBooks. Sorts the array in-place (i.e., actually modifies
* the elements of the array).
*
* This is almost exactly what was given in the pseudocode in
* the project spec–need to replace STUBS with real code
*/
voidsort_books(struct book books[], intnumBooks) {
int i, j, min;
intcmpResult;
for (i = 0; i <numBooks – 1; i++) {
min = i;
for (j = i + 1; j <numBooks; j++) {
/* Copy pointers to the two books to be compared into the
* global variables book1 and book2 for bookcmp() to see
*/
STUB: ADD STUFF HERE
cmpResult = bookcmp();
/* bookcmp returns result in register EAX–above saves
* it into cmpResult */
if (STUB: book2 COMES BEFORE book1) {
min = j;
}
}
if (min != i) {
STUB: SWAP books[i], books[min];
}
}
}
voidprint_books(struct book books[], intnumBooks) {
STUB: ADD CODE HERE TO OUTPUT LIST OF BOOKS
}
Solution
bookcmp.asm
extern book1, book2
; Offsets for fields in struct book.
;
%define AUTHOR_OFFSET 0
%define TITLE_OFFSET 21
%define SUBJECT_OFFSET 54
%define YEAR_OFFSET 68
section .text
globalbookcmp
bookcmp:
pushebx ; preserve register values in the stack
pushecx
pushesi
pushedi
movebx, [book1] ; load address of first structure in ebx
movecx, [book2] ; load address of second structure in ecx
leaesi, [ebx+YEAR_OFFSET] ; load first year pointer in esi
leaedi, [ecx+YEAR_OFFSET] ; load second year pointer in edi
moveax, [esi] ; load the first year value in eax
cmpeax, [edi] ; compare with second year value
jl lower ; if it’s lower, go to lower
jg greater ; if it’s greater, go to greater
; if we are here, the years are equal
; we will compare titles
leaesi, [ebx+TITLE_OFFSET] ; load first title pointer in esi
leaedi, [ecx+TITLE_OFFSET] ; load second title pointer in edi
mov ecx,0 ; we will use ecx as the index
cmploop:
mov al,[esi+ecx] ; load a char from the first title
cmp al,[edi+ecx] ; compare with char at same position in the second title
jl lower ; if it’s lower, go to lower
jg greater ; if it’s greater, go to greater
incecx ; chars were equal, increment index to compare next chars
cmp al,0 ; see if we reached the end of both strings
jnecmploop ; continue comparing characters while we don’t reach the end of the strings
; if we get here all chars were equal until the end of both strings
mov eax,0 ; set a zero return value to indicate equal
jmp return ; exit subroutine
lower:
mov eax,-1 ; set a negative return value to indicate lower
jmp return ; exit subroutine
greater:
mov eax,1 ; set a positive return value to indicate greater
return:
popedi ; restore old register values from stack
popesi
popecx
popebx
ret ; return from subroutine
sort_books.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/*
* Following is straight from the project description
*/
#define TITLE_LEN 32
#define AUTHOR_LEN 20
#define SUBJECT_LEN 10
#define BUFFER_LEN 80
struct book {
char author[AUTHOR_LEN + 1]; /* first author */
char title[TITLE_LEN + 1];
char subject[SUBJECT_LEN + 1]; /* Nonfiction, Fantasy, Mystery, … */
unsignedint year; /* year of e-book release */
};
/*
* Declarations for functions that are defined in other files
*/
externintbookcmp(void);
/*
* Declarations for global variables that need to be accessed
* from other files
*/
struct book *book1;
struct book *book2;
/* prototypes for local functions */
voidsort_books(struct book books[], intnumBooks);
voidprint_books(struct book books[], intnumBooks);
#define MAX_BOOKS 100
int main(intargc, char **argv) {
struct book books[MAX_BOOKS];
intnumBooks;
char buffer[BUFFER_LEN+1];
int i;
intnumFields;
/* STUB: ADD CODE HERE TO READ A RECORD AT A TIME FROM STDIN USING scanf()
UNTIL IT RETURNS EOF
*/
for (i = 0; i < MAX_BOOKS; i++) {
/* Sample line: “Breaking Point, Pamela Clare, Romance, 2011” */
/* We are giving you the scanf() format string; note that
* for the string fields, it uses the conversion spec “%##[^,]”,
* where “##” is an actual number. This says to read up to a
* maximum of ## characters (not counting the null terminator!),
* stopping at the first ‘,’ Also note that the first field spec–
* the title field–specifies 80 chars. The title field is NOT
* that large, so you need to read it into a temporary buffer first,
* of an appropriately large size so that scanf() doesn’t overrun it.
* All the other fields should be read directly into the structbook’s
* members.
*/
numFields = scanf(“%80[^,], %20[^,], %10[^,], %u \n”,
buffer,books[i].author,books[i].subject,&books[i].year);
if (numFields == EOF) {
numBooks = i;
break;
}
/* Now, process the record you just read.
* First, confirm that you got all the fields you needed (scanf()
* returns the actual number of fields matched).
* Then, post-process title (see project spec for reason)
*/
if(numFields!=4)
{
fprintf(stderr,”ERROR: error reading line %d from file, ”
“incorrect number of fields\n”,i+1);
exit(1);
}
if(strlen(buffer)>=TITLE_LEN) /* if the title doesn’t fit in the len*/
{
strncpy(books[i].title,buffer,TITLE_LEN); /* copy only what fits */
books[i].title[TITLE_LEN]=’\0′; /* insert a string terminator */
}
else
strcpy(books[i].title,buffer); /* otherwisem copy entire string */
}
/* Following assumes you stored actual number of books read into
* varnumBooks
*/
sort_books(books, numBooks);
print_books(books, numBooks);
exit(1);
}
/*
* sort_books(): receives an array of struct book’s, of length
* numBooks. Sorts the array in-place (i.e., actually modifies
* the elements of the array).
*
* This is almost exactly what was given in the pseudocode in
* the project spec–need to replace STUBS with real code
*/
voidsort_books(struct book books[], intnumBooks) {
int i, j, min;
intcmpResult;
struct book tmp;
for (i = 0; i <numBooks – 1; i++) {
min = i;
for (j = i + 1; j <numBooks; j++) {
/* Copy pointers to the two books to be compared into the
* global variables book1 and book2 for bookcmp() to see
*/
book1 = &books[j];
book2 = &books[min];
cmpResult = bookcmp();
/* bookcmp returns result in register EAX–above saves
* it into cmpResult */
if (cmpResult<0) {
min = j;
}
}
if (min != i) {
tmp=books[min];
books[min]=books[i];
books[i]=tmp;
}
}
}
voidprint_books(struct book books[], intnumBooks) {
int i;
for (i = 0; i <numBooks; i++) {
printf(“%s, %s, %s, %u\n”,books[i].title,books[i].author,
books[i].subject,books[i].year);
}
}