On any IP header, the first 4 bits are reserved for protocol version. So theoretically a protocol number between 0 and 15 is possible:
4: is already used for IPv4
5: is reserved for the Stream Protocol (STP, RFC 1819 / Internet Stream Protocol Version 2) (which never really made it to the public)
The next free number was 6. Hence IPv6 was born!
During the design of IPv4, people thought that 32 bits were enough for the world. Looking back into the past, 32 bits were enough until now and will perhaps be enough for another few years. However, 32 bits are not enough to provide each network device with a global address in the future. Think about mobile phones, cars (including electronic devices on its CAN-bus), toasters, refrigerators, light switches, and so on...
So designers have chosen 128 bits, 4 times more in length than in IPv4 today.
The usable size is smaller than it may appear however. This is because in the currently defined address schema, 64 bits are used for interface identifiers. The other 64 bits are used for routing. Assuming the current strict levels of aggregation (/48, /32, ...), it is still possible to “run out” of space, but hopefully not in the near future.
See also for more information RFC 1715 / The H Ratio for Address Assignment Efficiency and RFC 3194 / The Host-Density Ratio for Address Assignment Efficiency.
While, there are (possibly) some people (only know about Jim Fleming...) on the Internet who are thinking about IPv8 and IPv16, their design is far away from acceptance and implementation. In the meantime 128 bits was the best choice regarding header overhead and data transport. Consider the minimum Maximum Transfer Unit (MTU) in IPv4 (576 octets) and in IPv6 (1280 octets), the header length in IPv4 is 20 octets (minimum, can increase to 60 octets with IPv4 options) and in IPv6 is 48 octets (fixed). This is 3.4 % of MTU in IPv4 and 3.8 % of MTU in IPv6. This means the header overhead is almost equal. More bits for addresses would require bigger headers and therefore more overhead. Also, consider the maximum MTU on normal links (like Ethernet today): it's 1500 octets (in special cases: 9k octets using Jumbo frames). Ultimately, it wouldn't be a proper design if 10 % or 20 % of transported data in a Layer-3 packet were used for addresses and not for payload.