Text-Index Compression/Analysis in Julia

Amburose Sekar
3 min read · Jun 14, 2021

A Different Approach !!!

Photo by Walkator on Unsplash

Objective

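Encode text quickly as an RGB image by storing, for each word, its index in a reference word list. The idea builds on the one-hot vector method. Mixing text and image processing in this way can be useful for fast data conversion and AI workflows. The approach is implemented in Julia, and the result is verified by decoding the image back into text.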

Concept

Block Diagram

Working Principle

  1. An RGB pixel holds 24 bits (8 R + 8 G + 8 B), so the largest index a single pixel can store is 2²⁴-1.
  2. The reference library (the word list) is therefore limited to 2²⁴-1 entries.
  3. Each word of the input is looked up in the library, and its index is stored in the corresponding pixel.
  4. The encoded image is created from these pixels.
  5. On the receiver side, the original text is recovered from the image using the same reference library.
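Steps 1 to 3 come down to splitting one index into three bytes and joining them again. Below is a minimal sketch of that packing, assuming the same byte order as the script in this article (low byte in R, high byte in B). The names `MAX_INDEX`, `index_to_rgb` and `rgb_to_index` are made up for illustration and are not part of the original code.

```julia
# Minimal sketch; all names below are hypothetical helpers.
MAX_INDEX = 2^24 - 1   # largest index that fits in one 24-bit RGB pixel

# Split an index into three 8-bit channel values.
# Low byte -> R, middle byte -> G, high byte -> B.
function index_to_rgb(idx::Integer)
    0 <= idx <= MAX_INDEX || throw(ArgumentError("index $idx does not fit in 24 bits"))
    r = UInt8(idx & 0xff)
    g = UInt8((idx >> 8) & 0xff)
    b = UInt8((idx >> 16) & 0xff)
    return (r, g, b)
end

# Inverse operation: rebuild the index from the three channel bytes.
rgb_to_index(r::UInt8, g::UInt8, b::UInt8) = Int(r) | (Int(g) << 8) | (Int(b) << 16)

r, g, b = index_to_rgb(370103)
println((r, g, b))
println(rgb_to_index(r, g, b))
```

Shifts and masks give the same bytes as expanding the index into 24 binary digits and summing them in groups of eight, which is what the full script does.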

Implementation in Julia

using Images
using ImageMagick

## Reference text (the word library)
@time begin

    RLen = 370103                          # maximum number of library words to load
    # readlines on a path opens and closes the file itself
    dataRef = readlines("words_alpha.txt")
    dataRef = dataRef[1:min(RLen, end)]    # keep at most RLen words

end
##

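Len = 256                                  # number of pixels in the image (16 x 16)
data = Array{String,1}(undef, Len)
f = open("sample.txt", "r")
dataS = split.(readlines(f), " ")          # split every line into words

Len2 = size(dataS[1], 1)                   # number of words on the first line
data[1:Len2] = dataS[1]                    # only the first line is encoded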

#= Disabled alternative: fill data line by line instead of word by word
line_count = 0
for lines in readlines(f)
    # increment line_count
    global line_count += 1
    data[line_count] = lines
    if line_count >= Len
        break
    end
    # print the line
    println(lines)
end =#
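data1 = unique(dataRef)                    # de-duplicated library; position = code
ind = indexin(data[1:Len2], data1)         # library index of each input word, or nothing
indtemp = ind[.!isnothing.(ind)]           # words missing from the library are dropped
ind = [Int64.(indtemp); zeros(Int64, Len - length(indtemp))]   # pad with 0 (empty pixel)
S = digits.(ind, base=2, pad=24)           # 24 bits per index, least significant bit first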

R = Array{UInt8,1}(undef, Len)
G = Array{UInt8,1}(undef, Len)
B = Array{UInt8,1}(undef, Len)
sz2 = isqrt(Len)                           # image side length; Len must be a perfect square
Img = zeros(sz2, sz2, 3)

for z = 1:Len
    # bits 1-8 -> R, bits 9-16 -> G, bits 17-24 -> B
    R[z] = sum(S[z][k] * 2^(k-1) for k = 1:8)
    G[z] = sum(S[z][8+k] * 2^(k-1) for k = 1:8)
    B[z] = sum(S[z][16+k] * 2^(k-1) for k = 1:8)
end
Img[:,:,1] = reshape(R ./ 255, sz2, sz2)
Img[:,:,2] = reshape(G ./ 255, sz2, sz2)
Img[:,:,3] = reshape(B ./ 255, sz2, sz2)

# combine the three channels, scaled to [0, 1], into one RGB image
sd = colorview(RGB, reshape(R ./ 255, sz2, sz2), reshape(G ./ 255, sz2, sz2), reshape(B ./ 255, sz2, sz2))

display(mosaicview(sd))

save("EncodeImage.png", sd)

## Decode
img = load("EncodeImage.png")
img2 = RGB.(img)
img_D = channelview(float.(img2))          # 3 x rows x cols, values in [0, 1]
img_Dp = permutedims(img_D, (2, 3, 1))     # rows x cols x 3
Rs = img_Dp[:,:,1] .* 255.0
Gs = img_Dp[:,:,2] .* 255.0
Bs = img_Dp[:,:,3] .* 255.0

indp = Array{Int64,1}(undef, Len)

for z = 1:Len
    # rebuild the 24-bit pattern in the order used for encoding: R, then G, then B
    temp = [digits(round(UInt8, Rs[z]), base=2, pad=8);
            digits(round(UInt8, Gs[z]), base=2, pad=8);
            digits(round(UInt8, Bs[z]), base=2, pad=8)]
    temp2 = sum(temp[k] * 2^(k-1) for k = 1:24)
    indp[z] = temp2
end

println([indp ind])                        # decoded and original indices side by side

Special = [".", ",", "?", "%"]
# write the decoded words to a text file
open("newfile.txt", "w") do io
    for z = 1:Len
        if indp[z] != 0                    # 0 marks an unused pixel
            word = data1[indp[z]]
            if word in Special
                print(io, word)            # punctuation: no trailing space
            else
                print(io, word, " ")
            end
        end
    end
end;

Encoded image based on sample.txt
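To try the round trip without the word list, the sample file or any image package, here is a minimal sketch with a made-up five-word library. The names `encode_words` and `decode_pixels` and the library contents are assumptions for illustration only. Unlike the script above, which filters out words that are missing from the library, this sketch stores them as index 0 and reports them separately.

```julia
# Minimal sketch with a made-up library; all names here are hypothetical.
library = ["hello", "image", "julia", "text", "world"]

# Map each word to the (R, G, B) bytes of its 1-based library index.
# Index 0 is reserved for "no word", so unknown words are collected, not lost.
function encode_words(words, library)
    lookup = Dict(w => i for (i, w) in enumerate(library))
    pixels = NTuple{3,UInt8}[]
    unknown = String[]
    for w in words
        idx = get(lookup, w, 0)
        idx == 0 && push!(unknown, w)
        push!(pixels, (UInt8(idx & 0xff), UInt8((idx >> 8) & 0xff), UInt8((idx >> 16) & 0xff)))
    end
    return pixels, unknown
end

# Rebuild each index from its three bytes and look the word up again.
function decode_pixels(pixels, library)
    words = String[]
    for (r, g, b) in pixels
        idx = Int(r) | (Int(g) << 8) | (Int(b) << 16)
        idx == 0 || push!(words, library[idx])   # skip empty or unknown pixels
    end
    return words
end

pixels, unknown = encode_words(split("hello julia text foo"), library)
decoded = decode_pixels(pixels, library)
println(join(decoded, " "))
println(unknown)
```

Sender and receiver need exactly the same library in the same order, because only the indices travel inside the image.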

Future Scope

1. Secure data transfer

2. Programming language converter

3. Language translator

4. Using text as images in machine learning

Thank You !!!
