Self-describing Sequence/1002

approach ¶

1차 시도 ¶

대략 연습장에 수열이 만들어질 모습을 생각하면서 디자인, 20분 정도 코딩.

~cpp
def selfDescrib(n):
    table = [1,2,2]
    e=2
    while e!=n+1:
        e+=1
        table += [e] * table[e-1]
        if len(table) > n-1:
            return table[n-1]

for i in [100,9999,123456,1000000000]: print selfDescrib(i)

[edit]

2차 시도 ¶

문제는, 1000000000 의 경우에 대해서 답이 나오는 시간이 엄청나게 걸린다는 점이다. 이에 대해서 어떻게 할 것인가 고민,
여러 접근법에 대해서 생각하게 됨.

수열에 대한 일반식이 있을까?
k 값 대비 f(k) 값을 훨씬더 많이 알아낼 수 있을텐데, 이를 이용할 방법이 있을까?

그와 관련하여 프로그래밍을 진행해보았으나 그리 성과가 없음.

[edit]

3차 시도 ¶

어제에 이어서 고민하던 중, 문제점에 대해서 다시 생각. 결국은 f(k) 를 위한 table 생성에서 메모리를 많이 쓴다는 점이 문제.
이에 대해서 다르게 representation 할 방법을 생각하다가 다음과 같이 representation 할 방법을 떠올리게 됨.

f(1) : 1
f(2) ~ f(3) : 2
f(4) ~ f(5) : 3
f(6) ~ f(8) : 4
f(9) ~ f(11) : 5
.
.

이에 대해서 다음과 같이 구현

~cpp
# 해당 숫자 범위를 조사하여 어떤 값이 나올지를 return
def findGroupIdx(table,n,startPos=0):
	for i in xrange(len(table)):
		x,y=table[i]
		if x<=n<=y: return i+1

def selfDescrib2(n):
    table=[(1,1),(2,3)]
    if n==1 or n==2: return n

    for i in xrange(3,n+1):
        start = table[-1][-1] + 1
        theGroupIdx = findGroupIdx(table,i)
        repeat = theGroupIdx
        end = start + repeat - 1
        table.append((start,end))
        if start >= n:
            return findGroupIdx(table,n)

풀고 나니, 그래도 역시 1000000000 에 대해서는 굉장히 느림. 느릴 부분을 생각하던 중 findGroupIdx 부분이
문제임을 생각. 이를 binary search 구현으로 바꿈.

~cpp
def findGroupIdx(table,n,startPos=0):
    midIdx = len(table)/2
    mid = table[midIdx]
    x,y=mid
    if x<=n<=y: return midIdx+1+startPos
    if n<x: return findGroupIdx(table[:midIdx],n,startPos)
    else: return findGroupIdx(table[midIdx+1:],n,startPos+midIdx+1)

binary search 로 바꾸고 나서도 역시 오래걸림. 다른 검색법에 대해서 생각하던 중, findGroupIdx 함수 호출을 할때를 생각.
수열의 값이 늘 증가한다 할때 다음번에 검색하는 경우 앞에서 검색했던 그룹 아니면 그 다음 그룹임을 생각하게 됨.

~cpp
class FindGroupIdxs:
    def __init__(self,table):
        self._table = table
        self._prevSearchPos = 0
    def find(self,n):
        while True:
            x,y=self._table[self._prevSearchPos]
            if x<=n<=y: return self._prevSearchPos+1
            self._prevSearchPos+=1

def selfDescrib2(n):
    table=[(1,1),(2,3)]
    if n==1 or n==2: return n

    finder = FindGroupIdxs(table)

    for i in xrange(3,n+1):
        start = table[-1][-1] + 1
        theGroupIdx = finder.find(i)
        repeat = theGroupIdx
        end = start + repeat - 1
        table.append((start,end))
        if start >= n:
            return finder.find(n)

def main():
    import time
    for e in [100,9999,123456,1000000000]: 
        start=time.time()
        print e,selfDescrib2(e)
        print "time : ", time.time()-start

import psyco
psyco.full()
main()

수행 시간 : 1000000000 기준 0.95초(with psyco) 2.18초(without psyco)
메모리 사용량 : 31MB

[edit]

느낌 ¶

지속적으로 퍼포먼스 튜닝을 위해 다른 알고리즘으로 접근을 해본 점이 좋았음.
하지만, 수학적인 관계를 찾아내는데에는 역시 한계를 보임. 그냥 퍼포먼스 향상을 위한 알고리즘 개선법으로만 접근.

너무 직관에만 의존하여 푼 점이 아쉽긴 함.
다시금 '이 문제에서 요구한 방법이 이 방법이였을까?' 에 대해서 고민하게 됨. 비록 원하는 성능은 나오긴 했지만.