Tag Archives: MySQL

PHP – Import CSV to MySQL

The following code snippet can be used to import CSV into MySQL using PHP.

<?php
$delimiter = ',';

$db = new mysqli('localhost', 'username', 'password', 'database');

if (($handle = fopen("import.csv", "r")) !== FALSE) {
	while (($data = fgetcsv($handle, 1000, $delimiter)) !== FALSE) {
		foreach($data as $i => $content) {
			$data[$i] = $db->real_escape_string($content);
		}
		$db->query("INSERT INTO mytable VALUES('" . implode("','", $data) . "');");
	}
	fclose($handle);
}

?>

A few things to keep in mind:

  • Make sure you change the MySQL login information to your own login information
  • Make sure you change the table name, in this case, mytable, to your own table name.
  • For your CSV (comma separated values) file, the delimiter may not necessarily be a comma. Change the $delimiter variable to the one-character delimiter separating the different fields.
  • This does NOT automatically create a table. You have to manually do that.
  • Every row would be inserted, so you should manually remove the first row if it is just the column names.

Randomness of MySQL Rand

A few days ago, I needed to use MySQL’s rand() function. Some rows seemed to be picked more often than others, so I decided to test the randomness of the function. I tested this function in PHP using the MySQLi module.

$db = new mysqli('host', 'user', 'password', 'database');

To begin with, I inserted 100 rows, with numbers 1 to 100, using the following script:

for($i = 1; $i <= 100; $i++) {
     $query = $db->query("INSERT INTO numbers ( number ) VALUES ( $i );");
}

I used the following PHP snippet to test and record the randomness. Each round was ran 1000 times, and there were 10 rounds.

for($a = 0; $a < 10; $a++) {
    for($b = 0; $b < 1000; $b++) {         $query = $db->query("SELECT number FROM numbers ORDER BY RAND() LIMIT 1");
        $result = $query->fetch_assoc();
        $results[$a][$result['number']]++;
    }
    ksort($results[$a]);
}

Results
The easy way would be to use print_r to output the results. However, I wanted the final results in CSV format, so I could graph it in Excel.

for($a = 1; $a <= 100; $a++) {
    $str .= $a;
    for($b = 0; $b < 10; $b++) {
        $str .= ',' . $results[$b][$a];
    }
    $str .= "n";
}
echo $str;

Average

MySQL Rand Average

Although the difference is minor, there is actually a large difference since the rand() function was ran a total of 10,000 times. Since this graph shows the average number of times each number was picked among the 10 trials, the difference between 8 and 12 is 40 in reality. Here’s a graph showing the total number of times each number was picked out of 10,000.

Total Number Picked

Even though the above graphs doesn’t show any “dramatic” difference, one interesting thing to look at is the actual results for each round. Here are the results for the last 3 rounds.

Last 3 Rounds

As you can see, among the individual rounds, there are dramatic difference between the numbers. I’m hoping to find an alternative to MySQL’s rand function (perhaps use PHP’s rand function instead?). If you have any good alternatives, let me know!